Context-Aware Harassment Detection on Social Media
Context-Aware Harassment Detection on Social Media is an interdisciplinary project among the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), the Department of Psychology, and the Center for Urban and Public Affairs (CUPA) at Wright State University. The aim of this project is to develop comprehensive and reliable context-aware techniques (using machine learning, text mining, natural language processing, and social network analysis) to glean information about the people involved and their interconnected network of relationships, and to determine and evaluate potential harassment and harassers. An interdisciplinary team of computer scientists, social scientists, urban and public affairs professionals, and educators, together with the participation of college and high school students in the research, will ensure wide impact of the scientific research in support of safe social interactions.
As social media permeates our daily life, there has been a sharp rise in the use of social media to humiliate, bully, and threaten others, with harmful consequences such as emotional distress, depression, and suicide. The October 2014 Pew Research survey <ref>Pew Internet, Online Harassment, 2014.</ref> shows that 73% of adult Internet users have observed online harassment and 40% have experienced it. Of those who have experienced online harassment, 66% said their most recent incident occurred on a social networking site or app. Further, 25% of teens claim to have been cyberbullied <ref>Cyberbullying Research Center, Cyberbullying Facts, 2012.</ref>. The prevalence and serious consequences of online harassment present both social and technological challenges.
Existing work on harassment detection usually applies machine learning for binary classification, relying on message content while ignoring message context. Harassment is a pragmatic phenomenon and is necessarily context-sensitive. We identify three dimensions of context for the harassment phenomenon on social media: people, content, and network. Focusing on content while ignoring either people (offender and victim) or network (social networks of offender and victim) yields misleading results. An apparent "bullying conversation" between good friends with sarcastic content presents no serious threat, while the same content from an identifiable stranger may function as harassment. Content analysis alone cannot capture these subtle but important distinctions.
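The friend-versus-stranger distinction can be illustrated with a minimal sketch. This is not the project's method: the toy lexicon `OFFENSIVE_TERMS`, the discount factor `FRIEND_DISCOUNT`, and the function names are all hypothetical, chosen only to show how identical content can receive different scores once the network relationship between sender and recipient is taken into account.

```python
from dataclasses import dataclass
from typing import Dict, Set

# Hypothetical toy lexicon; a real system would rely on a curated resource.
OFFENSIVE_TERMS = {"loser", "idiot", "pathetic"}
# Hypothetical, arbitrary discount applied to messages between mutual friends.
FRIEND_DISCOUNT = 0.2


@dataclass
class Message:
    sender: str
    recipient: str
    text: str


def content_score(text: str) -> float:
    """Content-only signal: fraction of tokens found in the toy lexicon."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    return sum(t in OFFENSIVE_TERMS for t in tokens) / len(tokens)


def are_mutual_friends(network: Dict[str, Set[str]], a: str, b: str) -> bool:
    """Network signal: both users list each other as connections."""
    return b in network.get(a, set()) and a in network.get(b, set())


def context_aware_score(msg: Message, network: Dict[str, Set[str]]) -> float:
    """Scale the content score by the sender-recipient relationship."""
    base = content_score(msg.text)
    if are_mutual_friends(network, msg.sender, msg.recipient):
        return base * FRIEND_DISCOUNT
    return base


if __name__ == "__main__":
    network = {"ann": {"bob"}, "bob": {"ann"}, "eve": set()}
    text = "You are such a loser!"
    from_friend = Message("ann", "bob", text)
    from_stranger = Message("eve", "bob", text)
    print("content only:", content_score(text))
    print("from friend:", context_aware_score(from_friend, network))
    print("from stranger:", context_aware_score(from_stranger, network))
```

A content-only classifier would treat both messages identically; the sketch differs only in the network lookup, which is the point the paragraph above makes. A realistic system would also need the people dimension (for example, the history between offender and victim), which this sketch omits.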
Social science research identifies several necessary components and features of harassment that are typically ignored in existing binary harassment-or-not computation: (1) aggressive or offensive language, (2) potentially harmful emotional consequences, such as distress and psychological trauma, and (3) a deliberate intent to harm. We investigate novel language analysis techniques that examine the target-dependent offensiveness or negativity of a message, including the notion of target (recipient) sensitivity that is missing in existing harassment detection systems. The harassment value depends further on the resulting emotional harm and the intent of the sender. Thus, we reframe social media harassment detection as a multi-dimensional analysis of the degree to which harassment occurs. The specific research goals of this proposal are:
- Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research, 10th ACM Conference on Web Science (WebSci'18), Amsterdam, The Netherlands, 27-30 May 2018 (Nominated for the best paper award) [Published].
Principal Investigators: Prof. Amit P. Sheth
Co-Investigators: Prof. Valerie L. Shalin, Prof. Krishnaprasad Thirunarayan
Postdoctoral Researchers: Dr. Saeedeh Shekarpour
Graduate Students: Thilini Wijesiriwardene
Visiting Scholars: Mohammadreza Rezvan, Ugur Kursuncu
Other Collaborators: Prof. Debra Steele-Johnson, Dr. Jack L. Dustin
Past Members: Monireh Ebrahimi, Lu Chen, Wenbo Wang, Pranav Karan, Rajeshwari Kandakatla, Venkatesh Edupuganti
Contact: Lu Chen
- Technology for technology-induced disease: Wright State researchers using computer analysis as weapon against harassment, Wright State University Newsroom, Nov 10, 2015.
- Wright State University researchers developing new software that can detect cyber-bullying, WDTN (NBC, Channel 2), Nov. 19, 2015.
- Wright State launches cyber bullying research, coverage in WHIO TV (Channel 7), Nov. 23, 2015.
- Hazards SEES: Social and Physical Sensing Enabled Decision Support for Disaster Management and Response (NSF)
- Modeling Social Behavior for Healthcare Utilization in Depression (NIH)
- Project Safe Neighborhood
- eDrugTrends (NIH)
- Innovative NIDA National Early Warning System Network (iN3)
- Market Driven Innovations and Scaling up of Twitris
- KHealth: Semantic Multisensory Mobile Approach to Personalized Asthma Care
- SoCS: Social Media Enhanced Organizational Sensemaking in Emergency Response (NSF)
- Twitris: a System for Collective Social Intelligence
- PREDOSE: PREscription Drug abuse Online Surveillance and Epidemiology
- A painfully funny but informative introduction to the problem of online harassment: https://www.youtube.com/watch?v=PuNIwYsz7PI
- Why People Post Benevolent and Malicious Comments Online: https://vimeo.com/141448254
- Ugur Kursuncu, Manas Gaur, Usha Lokala, Krishnaprasad Thirunarayan, Amit Sheth and I. Budak Arpinar. "Predictive Analysis on Twitter: Techniques and Applications". Book Chapter in "Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining", Editor: Nitin Agarwal, Springer, 2018.
- Lu Chen, Justin Martineau, Doreen Cheng and Amit Sheth. "Clustering for Simultaneous Extraction of Aspects and Features from Reviews" Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL); 2016.
- Sujan Perera, Pablo N. Mendes, Adarsh Alex, Amit P. Sheth, and Krishnaprasad Thirunarayan. "Implicit Entity Linking in Tweets" In International Semantic Web Conference, pp. 118-132. Springer International Publishing; 2016.
- Lakshika Balasuriya, Sanjaya Wijeratne, Derek Doran, Amit Sheth. "Finding Street Gang Members on Twitter" In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016). San Francisco, CA, USA; 2016.
- Sanjaya Wijeratne, Lakshika Balasuriya, Derek Doran, Amit Sheth. "Word Embeddings to Enhance Twitter Gang Member Profile Identification" In IJCAI Workshop on Semantic Machine Learning (SML 2016). New York City, NY: CEUR-WS; 2016.
- Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: An Open Service and API for Emoji Sense Discovery, In 11th International AAAI Conference on Web and Social Media (ICWSM 2017). Montreal, Canada; 2017.
- Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. A Semantics-Based Measure of Emoji Similarity, In 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). Leipzig, Germany; 2017.
- Sanjaya Wijeratne, Amit Sheth, Shreyansh Bhatt, Lakshika Balasuriya, Hussein Al-Olimat, Manas Gaur, Amir Hossein Yazdavar, Krishnaprasad Thirunarayan. "Feature Engineering for Twitter-based Applications", in Feature Engineering for Machine Learning and Data Analytics. Editors. Guozhu Dong and Huan Liu. Chapman and Hall/CRC Data Mining and Knowledge Discovery Series. pp 359-393, March, 2018.