NJUST-KMG at TRAC-2024 Tasks 1 and 2: Offline Harm Potential Identification

Read original: arXiv:2403.19713 - Published 4/1/2024 by Jingyuan Wang, Shengdong Xu, Yang Yang

Introduction

The TRAC-2024 Offline Harm Potential Identification task focuses on predicting whether online posts in various Indian languages could lead to offline harm events like riots, mob lynching, murder, or rape. The exponential growth of digital platforms necessitates monitoring diverse, multilingual content to prevent detrimental social consequences. The task emphasizes understanding nuanced implications in conversations across different languages and cultures.

The system fine-tunes pretrained language models with a contrastive learning objective, so that their general-purpose representations are adapted to this task. Contrastive learning improves the model's ability to separate subtly different multilingual content, making it more robust to linguistic and semantic variation. An ensemble strategy at test time combines the strengths of the individual models and improves generalization across data points.

Participating in the TRAC-2024 task provided insights into content moderation and harm prediction in cross-linguistic and cultural contexts. The decision to integrate contrastive learning was driven by empirical observations during development, where the model struggled to distinguish between the top three categories of harm potential. Contrastive learning increases the distance in the feature space between harm potential categories, reducing ambiguity and improving classification precision. This methodological pivot was instrumental in addressing the nuances of multilingual content, which requires a delicate balance of linguistic subtlety and cultural awareness to accurately identify and categorize harm potential indicators.

Background

The TRAC-2024 challenge consisted of two sub-tasks to evaluate the offline harm potential of online content. The input was social media text data in various Indian languages, annotated to assess harm potential. Sub-task 1a required a four-tier classification predicting the potential of a document to cause offline harm, from 'harmless' to 'highly likely to incite harm.' Sub-task 1b involved predicting the potential target identities impacted by the harm, classifying them into categories like gender, religion, and political ideology.

This summary concentrates on sub-task 1a, although the authors report results for both sub-tasks. Their approach builds on existing research on pretrained models for text classification, such as BERT. Their main contribution is the incorporation of contrastive learning to refine these models for multilingual Indian social media text. The aim is to sharpen the separation between closely related harm-potential categories, addressing the challenge of high intra-class variation and inter-class similarity.

Method

The paper discusses the base models and strategies used for the TRAC-2024 competition. The authors compared several pre-trained models, including XLM-R, MuRILBERT, and BanglaBERT. XLM-R is a multilingual model trained on 100 languages, while MuRILBERT focuses on Indian languages and BanglaBERT is specifically designed for the Bengali language. The authors fine-tuned these models by adding a linear layer for classification tasks.
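To make the "linear layer for classification" concrete, here is a minimal sketch in plain Python. It assumes the encoder has already produced one pooled embedding per post; the tiny hidden size, the random weights, and the helper names `linear_head` and `softmax` are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def linear_head(features, weights, bias):
    """Map one pooled sentence embedding to class logits: logits = W h + b."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Turn logits into probabilities (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


HIDDEN_SIZE = 8   # toy value; real encoders use hundreds of dimensions
NUM_CLASSES = 4   # the four harm-potential tiers of sub-task 1a

rng = random.Random(0)
# Hypothetical, randomly initialised head; fine-tuning would learn these values.
weights = [[rng.gauss(0.0, 0.02) for _ in range(HIDDEN_SIZE)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

# Stand-in for the encoder output of a single post.
embedding = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]

probs = softmax(linear_head(embedding, weights, bias))
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

During fine-tuning, both the encoder and this head would be updated together; the sketch only shows the forward pass of the head.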

The strategy has three parts: fine-tuning the pretrained models, adding a contrastive learning loss, and ensembling the resulting models. Contrastive learning trains a model to pull the representations of similar examples together and push those of dissimilar examples apart. The most commonly used loss function for contrastive learning is InfoNCE.
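As a rough illustration of the InfoNCE idea, the sketch below computes the loss for one anchor, one positive, and a set of negatives using cosine similarity. The temperature of 0.1, the toy 2-d vectors, and the way positives are chosen are assumptions for illustration; the summary does not state the exact formulation the authors used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE for one anchor: -log( e^{s+/t} / sum_i e^{s_i/t} )."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # log-sum-exp trick for numerical stability
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denominator - logits[0]


# Toy embeddings: the positive lies close to the anchor, the negatives do not.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_good = info_nce(anchor, positive, negatives)
# Swapping in a dissimilar vector as the "positive" should give a larger loss.
loss_bad = info_nce(anchor, negatives[0], [positive, negatives[1]])
```

The loss is small when the anchor is much closer to its positive than to any negative, which is the behaviour that widens the gap between categories in feature space.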

Model ensembling combines multiple models to improve performance and stability: integrating their predictions offsets the biases and variance of any single model. The authors aggregated predictions from several fine-tuned models to capture the complexities of the multilingual dataset. This approach helped reduce overfitting and produced the best F1 score reported in the experiments.
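One simple way to aggregate predictions is to average the class probabilities of each model and take the most probable class. The sketch below assumes each model outputs a probability per class; the model keys and the numbers are made up for illustration and are not results from the paper.

```python
def average_ensemble(model_probs):
    """Average per-class probabilities across models for a single example."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [
        sum(probs[c] for probs in model_probs) / n_models
        for c in range(n_classes)
    ]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical outputs of three fine-tuned models for one post (4 classes).
predictions = {
    "xlmr_base": [0.10, 0.45, 0.40, 0.05],
    "xlmr_large": [0.05, 0.30, 0.55, 0.10],
    "muril": [0.10, 0.35, 0.50, 0.05],
}

averaged = average_ensemble(list(predictions.values()))
ensemble_label = argmax(averaged)
```

In this toy case the first model alone would pick class 1, but the averaged distribution favours class 2, which shows how averaging can override a single model's error.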

Experiment

Dataset and Metrics: The paper treats sub-task 1a as a 4-class classification task and sub-task 1b as a multi-label classification task over 5 labels. The provided training data is split into training and validation sets at a 4:1 ratio. The evaluation metric is the F1 score.
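The split and the metric can be sketched as follows. The summary does not say how the split was randomised or which F1 averaging was used, so the fixed seed and the choice of macro-averaged F1 here are assumptions, as are the helper names.

```python
import random


def split_train_val(examples, train_parts=4, val_parts=1, seed=42):
    """Shuffle and split examples at a train:val ratio (4:1 by default)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) * val_parts // (train_parts + val_parts)
    return shuffled[n_val:], shuffled[:n_val]


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy data: 100 example ids, and six gold/predicted labels over 4 classes.
train, val = split_train_val(range(100))
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 2, 3, 2, 2]
score = macro_f1(y_true, y_pred, num_classes=4)
```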

Implementation Details: The approach fine-tunes pretrained models such as XLM-RoBERTa-base, XLM-RoBERTa-large, MuRILBERT, and BanglaBERT. The maximum input length is 512 tokens. For sub-task 1b, the decision threshold η for assigning a label is set to 0.5.
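A threshold like η is typically applied to per-label scores in multi-label classification. The sketch below assumes the model emits one logit per label, that a sigmoid converts each logit to a score, and that a label is assigned when its score is at least η; these details, the function names, and the example logits are assumptions rather than facts from the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_labels(logits, eta=0.5):
    """Return the indices of all labels whose score reaches the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= eta]


# Hypothetical logits for the 5 target labels of one post.
logits = [2.0, -1.5, 0.3, -0.2, -3.0]
selected = predict_labels(logits)
```

Because each label is thresholded independently, a post can receive several target labels, or none at all.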

Comparison Methods and Results: The reported comparison focuses on sub-task 1a. The results show that more sophisticated models generally achieve higher F1 scores. IndicBERT and BanglaHateBERT score lower than the multilingual models. The XLM-R base and large variants both reach an F1 of 0.70. The highest score, 0.73, is achieved by the model ensemble.

Ablation Study: The study analyzes the contribution of contrastive loss and model ensemble strategies. Adding contrastive loss improves the F1 score. Among different ensemble strategies, the average ensemble method yields the highest results.

Conclusion

The paper acknowledges limitations in the ensemble model and contrastive learning approach used for the TRAC-2024 competition, despite their strategic implementation. The complexity of the multilingual dataset and subtle contextual nuances in social media comments required finer modeling granularity. The ensemble faced challenges with rare language constructs and cultural idioms, occasionally leading to misclassifications. The contrastive learning needed more sophisticated negative sampling strategies to fully capture the complex dynamics of potential offline harm in diverse cultural contexts. These limitations highlight areas for future research and refinement to develop a model with more nuanced understanding and predictive power.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

NJUST-KMG at TRAC-2024 Tasks 1 and 2: Offline Harm Potential Identification

Jingyuan Wang, Shengdong Xu, Yang Yang

This report provides a detailed description of the method that we proposed in the TRAC-2024 Offline Harm Potential Identification task, which comprises two sub-tasks. The investigation utilized a rich dataset of social media comments in several Indian languages, annotated with precision by expert judges to capture the nuanced implications for offline context harm. The objective assigned to the participants was to design algorithms capable of accurately assessing the likelihood of harm in given situations and identifying the most likely target(s) of offline harm. Our approach ranked second in two separate tracks, with F1 values of 0.73 and 0.96 respectively. Our method principally involved selecting pretrained models for finetuning, incorporating contrastive learning techniques, and culminating in an ensemble approach for the test set.

4/1/2024

Towards Generalized Offensive Language Identification

Alphaeus Dmonte, Tejas Arya, Tharindu Ranasinghe, Marcos Zampieri

The prevalence of offensive content on the internet, encompassing hate speech and cyberbullying, is a pervasive issue worldwide. Consequently, it has garnered significant attention from the machine learning (ML) and natural language processing (NLP) communities. As a result, numerous systems have been developed to automatically identify potentially harmful content and mitigate its impact. These systems can follow two approaches: (1) use publicly available models and application endpoints, including prompting large language models (LLMs), or (2) annotate datasets and train ML models on them. However, both approaches lack an understanding of how generalizable they are. Furthermore, the applicability of these systems is often questioned in off-domain and practical environments. This paper empirically evaluates the generalizability of offensive language detection models and datasets across a novel generalized benchmark. We answer three research questions on generalizability. Our findings will be useful in creating robust real-world offensive language detection systems.

7/29/2024

Breaking the Silence Detecting and Mitigating Gendered Abuse in Hindi, Tamil, and Indian English Online Spaces

Advaitha Vetagiri, Gyandeep Kalita, Eisha Halder, Chetna Taparia, Partha Pakray, Riyanka Manna

Online gender-based harassment is a widespread issue limiting the free expression and participation of women and marginalized genders in digital spaces. Detecting such abusive content can enable platforms to curb this menace. We participated in the Gendered Abuse Detection in Indic Languages shared task at ICON2023 that provided datasets of annotated Twitter posts in English, Hindi and Tamil for building classifiers to identify gendered abuse. Our team CNLP-NITS-PP developed an ensemble approach combining CNN and BiLSTM networks that can effectively model semantic and sequential patterns in textual data. The CNN captures localized features indicative of abusive language through its convolution filters applied on embedded input text. To determine context-based offensiveness, the BiLSTM analyzes this sequence for dependencies among words and phrases. Multiple variations were trained using FastText and GloVe word embeddings for each language dataset comprising over 7,600 crowdsourced annotations across labels for explicit abuse, targeted minority attacks and general offences. The validation scores showed strong performance across f1-measures, especially for English 0.84. Our experiments reveal how customizing embeddings and model hyperparameters can improve detection capability. The proposed architecture ranked 1st in the competition, proving its ability to handle real-world noisy text with code-switching. This technique has a promising scope as platforms aim to combat cyber harassment facing Indic language internet users. Our Code is at https://github.com/advaithavetagiri/CNLP-NITS-PP

4/4/2024

Multilingual Models for Check-Worthy Social Media Posts Detection

Sebastian Kula, Michal Gregor

This work presents an extensive study of transformer-based NLP models for detection of social media posts that contain verifiable factual claims and harmful claims. The study covers various activities, including dataset collection, dataset pre-processing, architecture selection, setup of settings, model training (fine-tuning), model testing, and implementation. The study includes a comprehensive analysis of different models, with a special focus on multilingual models where the same model is capable of processing social media posts in both English and in low-resource languages such as Arabic, Bulgarian, Dutch, Polish, Czech, Slovak. The results obtained from the study were validated against state-of-the-art models, and the comparison demonstrated the robustness of the proposed models. The novelty of this work lies in the development of multi-label multilingual classification models that can simultaneously detect harmful posts and posts that contain verifiable factual claims in an efficient way.

8/14/2024