A multitask learning framework for leveraging subjectivity of annotators to identify misogyny

2406.15869

Published 6/26/2024 by Jason Angel, Segun Taofeek Aroyehun, Grigori Sidorov, Alexander Gelbukh

Abstract

Identifying misogyny using artificial intelligence is a form of combating online toxicity against women. However, the subjective nature of interpreting misogyny poses a significant challenge to model the phenomenon. In this paper, we propose a multitask learning approach that leverages the subjectivity of this task to enhance the performance of the misogyny identification systems. We incorporated diverse perspectives from annotators in our model design, considering gender and age across six profile groups, and conducted extensive experiments and error analysis using two language models to validate our four alternative designs of the multitask learning technique to identify misogynistic content in English tweets. The results demonstrate that incorporating various viewpoints enhances the language models' ability to interpret different forms of misogyny. This research advances content moderation and highlights the importance of embracing diverse perspectives to build effective online moderation systems.

Overview

This paper proposes a multitask learning framework to leverage the subjectivity of annotators in order to identify misogyny in online content.
The framework aims to capture the diverse perspectives of annotators and use this information to improve the performance of misogyny detection models.
The authors evaluate their approach on several datasets and demonstrate improvements over existing methods.

Plain English Explanation

Identifying misogyny, or hatred towards women, in online content is an important but challenging task. This is because people's perceptions of what constitutes misogyny can vary quite a bit. What one person might consider misogynistic, another person might not.

<a href="https://aimodels.fyi/papers/arxiv/sexism-detection-data-diet">Previous research</a> has shown that the way data is annotated for training machine learning models can have a big impact on model performance. If the annotators don't agree on what should be considered misogynistic, the models will struggle to learn a consistent definition.

This paper tries to address this issue by developing a new machine learning framework that leverages the subjectivity of the annotators. Instead of trying to force the annotators to agree, the framework captures their diverse perspectives and uses that information to build a more robust misogyny detection model.

The key idea is to frame misogyny detection as a multitask learning problem. The model is trained not only to identify misogyny, but also to predict how each individual annotator would label a given piece of text. By learning to predict the annotators' judgments, the model can better understand the nuances of what constitutes misogyny according to different people.

<a href="https://aimodels.fyi/papers/arxiv/noise-correction-subjective-datasets">The authors argue</a> that this approach can help overcome issues with noisy subjective annotation data, which is common when dealing with complex social phenomena like misogyny.

Technical Explanation

The paper proposes a multitask learning framework that jointly learns to predict misogyny labels and annotator-specific labels. The main task is to identify whether a given text is misogynistic or not. The auxiliary tasks are to predict how each individual annotator would label the text.
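
As a rough illustration of the main and auxiliary targets described above, the sketch below turns the labels that several annotators gave one text into a single main label plus one auxiliary label per annotator. The annotator names, the function `build_targets`, and the majority-vote aggregation are assumptions made for this example; the summary does not say how the paper derives its main label.

```python
from collections import Counter


def build_targets(annotations):
    """Turn per-annotator labels into one main target plus auxiliary targets.

    `annotations` maps a hypothetical annotator id to a 0/1 label
    (1 = misogynistic). Majority voting for the main label is an
    assumption of this sketch, not a detail confirmed by the source.
    """
    votes = Counter(annotations.values())
    # Ties resolve to 1; this is an arbitrary choice for the sketch.
    main_label = 1 if votes[1] >= votes[0] else 0
    # Each annotator's own label is kept as a separate auxiliary target.
    return {"main": main_label, "auxiliary": dict(annotations)}


example = {"annotator_a": 1, "annotator_b": 0, "annotator_c": 1}
targets = build_targets(example)
```

Keeping the individual labels next to the aggregated one is what lets disagreement remain visible to the model instead of being averaged away.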

The model architecture consists of a shared base encoder that encodes the input text, and separate task-specific heads that make predictions for the main misogyny task and the auxiliary annotator prediction tasks. The base encoder learns representations that are useful for both the main and auxiliary tasks, while the task-specific heads allow the model to capture the nuances of different annotators' perspectives.
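
The shared-encoder-plus-heads layout can be sketched in a few lines of plain Python. This is a toy stand-in, not the paper's model: the class `MultitaskClassifier`, the tiny vocabulary, the random weights, and the bag-of-words encoder are all hypothetical, and in practice a pretrained language model would play the role of the encoder.

```python
import math
import random


class MultitaskClassifier:
    """Toy shared encoder with one sigmoid head per task (illustrative only)."""

    def __init__(self, vocab, hidden_size, task_names, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.vocab = {word: i for i, word in enumerate(vocab)}
        # Shared encoder weights: one hidden vector per vocabulary word.
        self.encoder = [
            [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)] for _ in vocab
        ]
        # One independent weight vector (head) per task.
        self.heads = {
            name: [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
            for name in task_names
        }

    def encode(self, text):
        """Shared representation: tanh of the summed known-word vectors."""
        hidden = [0.0] * self.hidden_size
        for word in text.lower().split():
            idx = self.vocab.get(word)
            if idx is not None:
                hidden = [h + w for h, w in zip(hidden, self.encoder[idx])]
        return [math.tanh(h) for h in hidden]

    def predict(self, text):
        """Encode once, then let every head score the same representation."""
        hidden = self.encode(text)
        scores = {}
        for name, head in self.heads.items():
            logit = sum(h * w for h, w in zip(hidden, head))
            scores[name] = 1.0 / (1.0 + math.exp(-logit))
        return scores


model = MultitaskClassifier(
    vocab=["women", "belong", "kitchen", "great", "game"],
    hidden_size=4,
    task_names=["main", "annotator_a", "annotator_b"],
)
probs = model.predict("women belong kitchen")
```

The point of the design is that `encode` runs once per text and every head reads the same vector, so anything the auxiliary heads learn about annotator differences has to pass through the representation the main head also uses.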

During training, the model is optimized to minimize the combined loss across the main misogyny task and the auxiliary annotator prediction tasks. This encourages the model to learn representations that are not only useful for detecting misogyny, but also capture the subjectivity inherent in how different annotators might judge the same text.
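
One plausible form of such a combined loss is the main-task cross-entropy plus a weighted average of the auxiliary cross-entropies. The sketch below assumes binary labels; the weighting factor `aux_weight`, the averaging over annotators, and the skipping of missing labels are choices made for this example and are not taken from the paper.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy for one predicted probability and one 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def combined_loss(probs, targets, aux_weight=1.0):
    """Main-task loss plus the weighted mean of the auxiliary losses.

    `aux_weight` and the mean over annotators are assumptions of this
    sketch. Annotators with no label for the text (None) are skipped.
    """
    main = binary_cross_entropy(probs["main"], targets["main"])
    aux = [
        binary_cross_entropy(probs[name], label)
        for name, label in targets["auxiliary"].items()
        if label is not None
    ]
    aux_mean = sum(aux) / len(aux) if aux else 0.0
    return main + aux_weight * aux_mean


probs = {"main": 0.8, "annotator_a": 0.9, "annotator_b": 0.3}
targets = {"main": 1, "auxiliary": {"annotator_a": 1, "annotator_b": None}}
loss = combined_loss(probs, targets)
```

Setting `aux_weight` to zero recovers an ordinary single-task objective, which gives a simple baseline to compare against.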

The authors evaluate their approach on several misogyny detection datasets, including <a href="https://aimodels.fyi/papers/arxiv/pejorativity-disambiguating-pejorative-epithets-to-improve-misogyny">TRAC</a>, <a href="https://aimodels.fyi/papers/arxiv/investigating-annotator-bias-large-language-models-hate">HatEval</a>, and <a href="https://aimodels.fyi/papers/arxiv/bilingual-sexism-classification-fine-tuned-xlm-roberta">AMI</a>. They demonstrate that their multitask learning framework outperforms baseline models that do not explicitly account for annotator subjectivity.

Critical Analysis

The paper presents a novel and promising approach to addressing the challenges of subjective annotation data in the context of misogyny detection. By leveraging the diverse perspectives of annotators, the multitask learning framework can potentially learn more robust and nuanced representations of what constitutes misogynistic content.

However, the authors acknowledge that their approach has some limitations. For example, the framework assumes that the annotators' judgments are consistent over time, which may not always be the case. Additionally, the model relies on having access to the individual annotator labels, which may not be available in all real-world scenarios.

Further research could explore ways to relax these assumptions, such as by incorporating dynamic modeling of annotator subjectivity or leveraging unlabeled data to learn more generalizable representations. It would also be valuable to investigate the interpretability of the learned representations, to better understand how the model is capturing the nuances of misogyny as perceived by different annotators.

Conclusion

This paper presents a compelling approach to improving misogyny detection by explicitly modeling the subjectivity of annotators. By framing the problem as a multitask learning task, the proposed framework can leverage the diverse perspectives of annotators to build more robust and nuanced models.

The findings of this research have important implications for the development of accurate and fair content moderation systems, which are crucial for creating safer and more inclusive online environments. The authors' work also highlights the importance of carefully designing annotation processes and understanding the limitations of subjective data in the context of complex social phenomena.

Overall, this paper offers a promising direction for advancing the state of the art in misogyny detection and addressing the challenges inherent in working with subjective data in machine learning applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Sexism Detection on a Data Diet

Rabiraj Bandyopadhyay, Dennis Assenmacher, Jose M. Alonso Moral, Claudia Wagner

There is an increase in the proliferation of online hate commensurate with the rise in the usage of social media. In response, there is also a significant advancement in the creation of automated tools aimed at identifying harmful text content using approaches grounded in Natural Language Processing and Deep Learning. Although it is known that training Deep Learning models require a substantial amount of annotated data, recent line of work suggests that models trained on specific subsets of the data still retain performance comparable to the model that was trained on the full dataset. In this work, we show how we can leverage influence scores to estimate the importance of a data point while training a model and designing a pruning strategy applied to the case of sexism detection. We evaluate the model performance trained on data pruned with different pruning strategies on three out-of-domain datasets and find, that in accordance with other work a large fraction of instances can be removed without significant performance drop. However, we also discover that the strategies for pruning data, previously successful in Natural Language Inference tasks, do not readily apply to the detection of harmful content and instead amplify the already prevalent class imbalance even more, leading in the worst-case to a complete absence of the hateful class.

6/10/2024

cs.CL

Noise Correction on Subjective Datasets

Uthman Jinadu, Yi Ding

Incorporating every annotator's perspective is crucial for unbiased data modeling. Annotator fatigue and changing opinions over time can distort dataset annotations. To combat this, we propose to learn a more accurate representation of diverse opinions by utilizing multitask learning in conjunction with loss-based label correction. We show that using our novel formulation, we can cleanly separate agreeing and disagreeing annotations. Furthermore, this method provides a controllable way to encourage or discourage disagreement. We demonstrate that this modification can improve prediction performance in a single or multi-annotator setting. Lastly, we show that this method remains robust to additional label noise that is applied to subjective data.

6/5/2024

cs.LG cs.AI cs.HC

PejorativITy: Disambiguating Pejorative Epithets to Improve Misogyny Detection in Italian Tweets

Arianna Muti, Federico Ruggeri, Cagri Toraman, Lorenzo Musetti, Samuel Algherini, Silvia Ronchi, Gianmarco Saretto, Caterina Zapparoli, Alberto Barrón-Cedeño

Misogyny is often expressed through figurative language. Some neutral words can assume a negative connotation when functioning as pejorative epithets. Disambiguating the meaning of such terms might help the detection of misogyny. In order to address such task, we present PejorativITy, a novel corpus of 1,200 manually annotated Italian tweets for pejorative language at the word level and misogyny at the sentence level. We evaluate the impact of injecting information about disambiguated words into a model targeting misogyny detection. In particular, we explore two different approaches for injection: concatenation of pejorative information and substitution of ambiguous words with univocal terms. Our experimental results, both on our corpus and on two popular benchmarks on Italian tweets, show that both approaches lead to a major classification improvement, indicating that word sense disambiguation is a promising preliminary step for misogyny detection. Furthermore, we investigate LLMs' understanding of pejorative epithets by means of contextual word embeddings analysis and prompting.

4/4/2024

cs.CL cs.AI

Investigating Annotator Bias in Large Language Models for Hate Speech Detection

Amit Das, Zheng Zhang, Fatemeh Jamshidi, Vinija Jain, Aman Chadha, Nilanjana Raychawdhary, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals

Data annotation, the practice of assigning descriptive labels to raw data, is pivotal in optimizing the performance of machine learning models. However, it is a resource-intensive process susceptible to biases introduced by annotators. The emergence of sophisticated Large Language Models (LLMs), like ChatGPT, presents a unique opportunity to modernize and streamline this complex procedure. While existing research extensively evaluates the efficacy of LLMs as annotators, this paper delves into the biases present in LLMs, specifically GPT 3.5 and GPT 4o, when annotating hate speech data. Our research contributes to understanding biases in four key categories: gender, race, religion, and disability. Specifically targeting highly vulnerable groups within these categories, we analyze annotator biases. Furthermore, we conduct a comprehensive examination of potential factors contributing to these biases by scrutinizing the annotated data. We introduce our custom hate speech detection dataset, HateSpeechCorpus, to conduct this research. Additionally, we perform the same experiments on the ETHOS (Mollas et al., 2022) dataset for comparative analysis. This paper serves as a crucial resource, guiding researchers and practitioners in harnessing the potential of LLMs for data annotation, thereby fostering advancements in this critical field. The HateSpeechCorpus dataset is available here: https://github.com/AmitDasRup123/HateSpeechCorpus

6/19/2024

cs.CL cs.AI cs.LG