Cyberbullying Detection through Machine Learning
Our model is adapted from the thesis “Understanding and Fighting Bullying with Machine Learning,” written by University of Wisconsin-Madison student Junming Sui. The thesis argues that social media, combined with machine learning and natural language processing, provides an unprecedented amount of data for studying bullying among young people. This data comes from bullying victims who recount their experiences on their personal social media accounts. Sui calls a social media post about bullying, such as a tweet on Twitter, a “bullying trace,” and the underlying bullying experience a “bullying episode.” The machine learning model discussed in the thesis uses these bullying traces to better understand bullying in general, not only cyberbullying. The model we chose to run is a probabilistic model that assigns the tweets in a user’s Twitter timeline to their corresponding bullying episodes. In essence, the model uses hashtags and deletions in bullying traces to identify important factors in bullying. Because bullying is common in schools across the country and can have lasting effects on adolescents, this machine learning application is very important to us as a team.
Sean Jocher, Brad Ferguson, Adrian Gavrila, Cassie Turberville