Hatred Stems from Ignorance! Distillation of the Persuasion Modes in Countering Conversational Hate Speech

Read original: arXiv:2403.15449 - Published 7/17/2024 by Ghadi Alyahya, Abeer Aldayel


Overview

• This paper explores the use of persuasion modes to counter conversational hate speech online.
• The authors investigate how different persuasion strategies, such as logos, pathos, and ethos, can be effective in combating hate speech.
• The study examines the impact of these persuasion modes on both the hate speech authors and the broader online community.

Plain English Explanation

The paper focuses on how to effectively respond to hate speech online using different persuasion techniques. It looks at three main types of persuasion: logos (using logic and reasoning), pathos (appealing to emotions), and ethos (building credibility). The researchers explore how these persuasion strategies can be used to counter hate speech and change the minds of those who engage in it. They also look at the impact of these responses on the broader online community.

Technical Explanation

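The paper distills persuasion modes into reason (logos), emotion (pathos), and credibility (ethos) and examines how each is used in counterspeech against conversational hate speech. According to the abstract, the evaluation covers two interaction types, closed (multi-turn) and open (single-turn), across three topics: racism, sexism, and religious bigotry. It also compares human-sourced with machine-generated counterspeech and looks at how the stance of replies relates to the persuasion mode used. The authors report a general tendency to use reason, note that machine-generated counterspeech tends toward emotion while human counterspeech leans toward reason, and find that reason tends to obtain more supportive replies than the other modes.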
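To make this kind of comparison concrete, here is a minimal sketch of how persuasion-mode labels (reason, emotion, credibility) might be tallied per group once counterspeech replies have been annotated. The records, field names, and the `mode_shares` helper are hypothetical and for illustration only. They are not the authors' data, annotation scheme, or code.

```python
from collections import Counter, defaultdict

# Hypothetical toy annotations (made up for illustration, NOT from the paper).
# Each counterspeech reply carries a source, an interaction type, and one
# persuasion-mode label.
replies = [
    {"source": "human", "interaction": "closed", "mode": "reason"},
    {"source": "human", "interaction": "open", "mode": "reason"},
    {"source": "human", "interaction": "open", "mode": "emotion"},
    {"source": "human", "interaction": "closed", "mode": "credibility"},
    {"source": "machine", "interaction": "open", "mode": "emotion"},
    {"source": "machine", "interaction": "closed", "mode": "emotion"},
    {"source": "machine", "interaction": "open", "mode": "reason"},
]


def mode_shares(rows, group_key):
    """Return, for each value of group_key, the share of each persuasion mode."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group_key]][row["mode"]] += 1

    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {mode: n / total for mode, n in counter.items()}
    return shares


if __name__ == "__main__":
    # Compare mode usage by source, then by interaction type.
    for key in ("source", "interaction"):
        for group, dist in mode_shares(replies, key).items():
            print(key, group, dist)
```

Grouping by `"source"` contrasts human and machine replies, and grouping by `"interaction"` contrasts open and closed conversations. A real analysis would also need the reply stance and a statistical test, which this sketch leaves out.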

Critical Analysis

The paper provides valuable insights into the use of persuasion strategies to counter hate speech, but it also acknowledges some limitations. For example, the researchers note that the effectiveness of these approaches may vary depending on the specific context and the characteristics of the hate speech authors. Additionally, the paper suggests that further research is needed to explore the long-term impacts of these counter-speech interventions and to investigate how they may interact with other factors, such as platform design and moderation policies. Concerns have been raised about the potential for some counter-speech strategies to inadvertently reinforce or amplify hate, and the authors acknowledge the need to carefully consider these potential unintended consequences.

Conclusion

This paper provides a valuable contribution to the ongoing efforts to address hate speech online. By exploring the use of different persuasion modes, the authors offer insights into how we can more effectively counter hate speech and promote constructive dialogue. While the research highlights the potential of these approaches, it also underscores the need for continued study and the careful consideration of potential limitations and unintended consequences. Ultimately, the paper serves as an important step forward in understanding how to build a more inclusive and tolerant online environment.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

Hatred Stems from Ignorance! Distillation of the Persuasion Modes in Countering Conversational Hate Speech

Ghadi Alyahya, Abeer Aldayel

Examining the factors that counterspeech uses is at the core of understanding the optimal methods for confronting hate speech online. Various studies have assessed the emotional base factors used in counter speech, such as emotional empathy, offensiveness, and hostility. To better understand the counterspeech used in conversations, this study distills persuasion modes into reason, emotion, and credibility and evaluates their use in two types of conversation interactions: closed (multi-turn) and open (single-turn) concerning racism, sexism, and religious bigotry. The evaluation covers the distinct behaviors seen with human-sourced as opposed to machine-generated counterspeech. It also assesses the interplay between the stance taken and the mode of persuasion seen in the counterspeech. Notably, we observe nuanced differences in the counterspeech persuasion modes used in open and closed interactions, especially in terms of the topic, with a general tendency to use reason as a persuasion mode to express the counterpoint to hate comments. The machine-generated counterspeech tends to exhibit an emotional persuasion mode, while human counters lean toward reason. Furthermore, our study shows that reason tends to obtain more supportive replies than other persuasion modes. The findings highlight the potential for incorporating persuasion modes into studies about countering hate speech, as they can serve as an optimal means of explainability and pave the way for the further adoption of the reply's stance and the role it plays in assessing what comprises the optimal counterspeech.

7/17/2024


NLP for Counterspeech against Hate: A Survey and How-To Guide

Helena Bonaldi, Yi-Ling Chung, Gavin Abercrombie, Marco Guerini

In recent years, counterspeech has emerged as one of the most promising strategies to fight online hate. These non-escalatory responses tackle online abuse while preserving the freedom of speech of the users, and can have a tangible impact in reducing online and offline violence. Recently, there has been growing interest from the Natural Language Processing (NLP) community in addressing the challenges of analysing, collecting, classifying, and automatically generating counterspeech, to reduce the huge burden of manually producing it. In particular, researchers have taken different directions in addressing these challenges, thus providing a variety of related tasks and resources. In this paper, we provide a guide for doing research on counterspeech, by describing - with detailed examples - the steps to undertake, and providing best practices that can be learnt from the NLP studies on this topic. Finally, we discuss open challenges and future directions of counterspeech research in NLP.

4/1/2024

Hostile Counterspeech Drives Users From Hate Subreddits

Daniel Hickey, Matheus Schmitz, Daniel M. T. Fessler, Paul E. Smaldino, Kristina Lerman, Goran Murić, Keith Burghardt

Counterspeech -- speech that opposes hate speech -- has gained significant attention recently as a strategy to reduce hate on social media. While previous studies suggest that counterspeech can somewhat reduce hate speech, little is known about its effects on participation in online hate communities, nor which counterspeech tactics reduce harmful behavior. We begin to address these gaps by identifying 25 large hate communities (subreddits) within Reddit and analyzing the effect of counterspeech on newcomers within these communities. We first construct a new public dataset of carefully annotated counterspeech and non-counterspeech comments within these subreddits. We use this dataset to train a state-of-the-art counterspeech detection model. Next, we use matching to evaluate the causal effects of hostile and non-hostile counterspeech on the engagement of newcomers in hate subreddits. We find that, while non-hostile counterspeech is ineffective at keeping users from fully disengaging from these hate subreddits, a single hostile counterspeech comment substantially reduces the future likelihood of engagement. While offering nuance to the understanding of counterspeech efficacy, these results a) leave unanswered the question of whether hostile counterspeech dissuades newcomers from participation in online hate writ large, or merely drives them into less-moderated and more extreme hate communities, and b) raise ethical considerations about hostile counterspeech, which is both comparatively common and might exacerbate rather than mitigate the net level of antagonism in society. These findings underscore the importance of future work to improve counterspeech tactics and minimize unintended harm.

5/29/2024


A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

Jaylen Jones, Lingbo Mo, Eric Fosler-Lussier, Huan Sun

Counter narratives - informed responses to hate speech contexts designed to refute hateful claims and de-escalate encounters - have emerged as an effective hate speech intervention strategy. While previous work has proposed automatic counter narrative generation methods to aid manual interventions, the evaluation of these approaches remains underdeveloped. Previous automatic metrics for counter narrative evaluation lack alignment with human judgment as they rely on superficial reference comparisons instead of incorporating key aspects of counter narrative quality as evaluation criteria. To address prior evaluation limitations, we propose a novel evaluation framework prompting LLMs to provide scores and feedback for generated counter narrative candidates using 5 defined aspects derived from guidelines from counter narrative specialized NGOs. We found that LLM evaluators achieve strong alignment to human-annotated scores and feedback and outperform alternative metrics, indicating their potential as multi-aspect, reference-free and interpretable evaluators for counter narrative evaluation.

4/1/2024