AggregHate: An Efficient Aggregative Approach for the Detection of Hatemongers on Social Platforms

Read original: arXiv:2409.14464 - Published 9/24/2024 by Tom Marzea, Abraham Israeli, Oren Tsur

Overview

This paper proposes a novel approach called "AggregHate" for efficiently detecting hatemongers on social media platforms.
The method aggregates signals at the user level, combining a user's potentially hateful texts with their activity and network context, to identify users who consistently engage in hateful behavior.
The authors evaluate AggregHate on three datasets, X (Twitter), Gab, and Parler, and compare it to previously used text- and graph-based detection methods.

Plain English Explanation

The researchers developed a new system called "AggregHate" to help find people on social media who repeatedly post hateful or extreme content.

[Introduction] Social media platforms have become breeding grounds for online hate speech and extremism. Identifying the users behind this harmful content is an important but challenging task. Existing methods often rely on analyzing individual posts, which can miss users who spread hate more subtly over time.

[Approach] To address this, the AggregHate system takes an "aggregative" approach. Instead of just looking at single posts, it examines a user's overall behavior and history on the platform. This allows it to spot users who consistently engage in hateful activities, even if their individual posts don't seem extreme.

[Evaluation] The researchers tested AggregHate on real social media data and found that it outperformed other popular hate detection methods. It was able to identify hatemongers more accurately by considering their full activity patterns rather than just isolated incidents.

[Significance] By focusing on repeat offenders rather than one-off posts, AggregHate provides a more holistic and effective way to combat online hate speech and extremism. This could help social media platforms proactively address these serious issues and maintain healthier communities.

Technical Explanation

[Introduction] The paper introduces the problem of detecting hatemongers, or users who consistently engage in hateful behavior, on social media platforms. Existing detection methods often rely on analyzing the content of individual posts, which can miss users who spread hate more subtly over time.

[Approach] To address this, the authors propose a novel "aggregative" approach called AggregHate. Instead of just looking at single posts, AggregHate examines a user's overall activity and behavioral patterns on the platform. This allows it to identify users who consistently exhibit hateful behavior, even if their individual posts do not appear extreme.

[Architecture] The AggregHate system consists of three main components: 1) a user behavior modeling module that tracks each user's activity over time, 2) a hate speech detection module that classifies the content of individual posts, and 3) an aggregation module that combines the user-level and post-level signals to identify consistent hatemongers.
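The aggregation step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Post` type, the function names, and the thresholds are all hypothetical, the post-level scores are assumed to come from some upstream classifier, and only text scores and posting volume are combined here. The point it shows is that a user whose posts each fall below a per-post threshold can still be flagged once their history is pooled, while a single extreme post is not enough on its own.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    user_id: str
    hate_score: float  # hypothetical post-level classifier output in [0, 1]


def aggregate_user_scores(posts):
    """Pool post-level scores into per-user statistics."""
    by_user = defaultdict(list)
    for post in posts:
        by_user[post.user_id].append(post.hate_score)
    return {
        user_id: {"n_posts": len(scores), "mean_score": sum(scores) / len(scores)}
        for user_id, scores in by_user.items()
    }


def flag_hatemongers(summary, min_posts=3, min_mean_score=0.4):
    """Flag users with enough history and a consistently elevated mean score.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return sorted(
        user_id
        for user_id, stats in summary.items()
        if stats["n_posts"] >= min_posts and stats["mean_score"] >= min_mean_score
    )


posts = [
    # alice: no single post exceeds 0.5, but every post is moderately hateful
    Post("alice", 0.45), Post("alice", 0.40), Post("alice", 0.48), Post("alice", 0.42),
    # bob: one extreme post, too little history to call it a pattern
    Post("bob", 0.95),
    # carol: mostly benign, with one borderline post
    Post("carol", 0.05), Post("carol", 0.10), Post("carol", 0.60),
]

summary = aggregate_user_scores(posts)
flagged = flag_hatemongers(summary)
print(flagged)
```

A mean over the history is only one possible aggregation; a fuller version would also need a representation of the user's activity over time before making a user-level decision.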

[Evaluation] The authors evaluate AggregHate on three datasets, drawn from X (Twitter), Gab, and Parler, and compare its performance to previously used text- and graph-based detection methods. They report that processing a user's texts in their social context significantly improves the detection of hatemongers compared with looking at isolated incidents.

Critical Analysis

The paper presents a well-designed and thorough evaluation of the AggregHate system. However, the authors acknowledge some limitations and areas for future work:

[Limitations] The dataset used for evaluation may not fully represent the diversity of hate speech and extremism on social media. Additionally, the hate speech detection module relies on potentially biased training data, which could introduce systematic errors.

[Future Work] The authors suggest exploring ways to make the AggregHate system more robust to evasion tactics, where hatemongers might intentionally modify their behavior to avoid detection. Incorporating additional behavioral signals or using more advanced machine learning techniques could also improve the system's performance.

Overall, the AggregHate approach represents a promising step forward in the ongoing challenge of combating online hate speech and extremism. By focusing on repeat offenders rather than individual posts, the system offers a more comprehensive and effective solution for social media platforms.

Conclusion

The AggregHate paper presents a novel approach for efficiently detecting hatemongers on social media platforms. By taking an "aggregative" perspective that examines users' overall behavioral patterns, rather than just individual posts, the system is able to more accurately identify repeat offenders who consistently engage in hateful activities.

The authors' thorough evaluation demonstrates the advantages of this approach over existing hate detection methods. While the system has some limitations, the core ideas behind AggregHate offer a valuable contribution to the ongoing efforts to address the growing problem of online hate speech and extremism.

As social media continues to play an increasingly central role in our lives, tools like AggregHate will be crucial for platforms to maintain healthy, inclusive communities. The paper's focus on identifying persistent hatemongers represents an important step forward in this critical endeavor.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

AggregHate: An Efficient Aggregative Approach for the Detection of Hatemongers on Social Platforms

Tom Marzea, Abraham Israeli, Oren Tsur

Automatic detection of online hate speech serves as a crucial step in the detoxification of the online discourse. Moreover, accurate classification can promote a better understanding of the proliferation of hate as a social phenomenon. While most prior work focuses on the detection of hateful utterances, we argue that focusing on the user level is as important, albeit challenging. In this paper we consider a multimodal aggregative approach for the detection of hate-mongers, taking into account the potentially hateful texts, user activity, and the user network. We evaluate our methods on three unique datasets, X (Twitter), Gab, and Parler, showing that processing a user's texts in her social context significantly improves the detection of hate-mongers, compared to previously used text and graph-based methods. Our method can then be used to improve the classification of coded messages, dog-whistling, and racial gas-lighting, as well as inform intervention measures. Moreover, our approach is highly efficient even for very large datasets and networks.

9/24/2024

Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts

Cuong Nhat Vo, Khanh Bao Huynh, Son T. Luu, Trong-Hop Do

The growth of social networks makes toxic content spread rapidly. Hate speech detection is a task to help decrease the number of harmful comments. With the diversity in the hate speech created by users, it is necessary to interpret the hate speech besides detecting it. Hence, we propose a methodology to construct a system for targeted hate speech detection from online streaming texts from social media. We first introduce the ViTHSD - a targeted hate speech detection dataset for Vietnamese Social Media Texts. The dataset contains 10K comments, each comment is labeled to specific targets with three levels: clean, offensive, and hate. There are 5 targets in the dataset, and each target is labeled with the corresponding level manually by humans with strict annotation guidelines. The inter-annotator agreement obtained from the dataset is 0.45 by Cohen's Kappa index, which is indicated as a moderate level. Then, we construct a baseline for this task by combining the Bi-GRU-LSTM-CNN with the pre-trained language model to leverage the power of text representation of BERTology. Finally, we suggest a methodology to integrate the baseline model for targeted hate speech detection into the online streaming system for practical application in preventing hateful and offensive content on social media.

5/1/2024

Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse

Abinew Ali Ayele, Esubalew Alemneh Jalew, Adem Chanie Ali, Seid Muhie Yimam, Chris Biemann

The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text. In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating offensiveness and hatefulness intensities. Our study highlights that a considerable majority of tweets belong to the less offensive and less hate intensity levels, underscoring the need for early interventions by stakeholders. The prevalence of ethnic and political hatred targets, with significant overlaps in our dataset, emphasizes the complex relationships within Ethiopia's sociopolitical landscape. We build classification and regression models and investigate the efficacy of models in handling these tasks. Our results reveal that hate and offensive speech can not be addressed by a simplistic binary classification, instead manifesting as variables across a continuous range of values. The Afro-XLMR-large model exhibits the best performances achieving F1-scores of 75.30%, 70.59%, and 29.42% for the category, target, and regression tasks, respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model indicates strong alignments.

4/19/2024

Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models

Nishant Vishwamitra, Keyan Guo, Farhan Tajwar Romit, Isabelle Ondracek, Long Cheng, Ziming Zhao, Hongxin Hu

Online hate is an escalating problem that negatively impacts the lives of Internet users, and is also subject to rapid changes due to evolving events, resulting in new waves of online hate that pose a critical threat. Detecting and mitigating these new waves present two key challenges: it demands reasoning-based complex decision-making to determine the presence of hateful content, and the limited availability of training samples hinders updating the detection model. To address this critical issue, we present a novel framework called HATEGUARD for effectively moderating new waves of online hate. HATEGUARD employs a reasoning-based approach that leverages the recently introduced chain-of-thought (CoT) prompting technique, harnessing the capabilities of large language models (LLMs). HATEGUARD further achieves prompt-based zero-shot detection by automatically generating and updating detection prompts with new derogatory terms and targets in new wave samples to effectively address new waves of online hate. To demonstrate the effectiveness of our approach, we compile a new dataset consisting of tweets related to three recently witnessed new waves: the 2022 Russian invasion of Ukraine, the 2021 insurrection of the US Capitol, and the COVID-19 pandemic. Our studies reveal crucial longitudinal patterns in these new waves concerning the evolution of events and the pressing need for techniques to rapidly update existing moderation tools to counteract them. Comparative evaluations against state-of-the-art tools illustrate the superiority of our framework, showcasing a substantial 22.22% to 83.33% improvement in detecting the three new waves of online hate. Our work highlights the severe threat posed by the emergence of new waves of online hate and represents a paradigm shift in addressing this threat practically.

5/13/2024