MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector

Read original: arXiv:2401.05060 - Published 6/28/2024 by Marta R. Costa-jussà, Mariano Coria Meglioli, Pierre Andrews, David Dale, Prangthip Hansanti, Elahe Kalbassi, Alex Mourachko, Christophe Ropers, Carleigh Wood


Overview

Presents a new multilingual audio-based toxicity detection dataset called MuTox
Introduces a zero-shot toxicity detection model that can analyze speech in multiple languages
Demonstrates the model's ability to effectively identify toxic content in audio, even for languages it wasn't trained on

Plain English Explanation

The paper introduces a new dataset called MuTox that contains audio recordings of toxic and non-toxic speech in multiple languages. Building on this dataset, the researchers developed a toxicity detection model that can analyze audio in various languages without being explicitly trained on those languages.

This "zero-shot" approach means the model can identify toxic content in speech, even for languages it hasn't seen before. This is an important advancement, as prior toxicity detection systems were often limited to a single language or required retraining for new languages.

The key innovation here is the ability to leverage a multilingual dataset to train a toxicity model that generalizes across languages. This allows the model to be deployed more broadly and handle the linguistic diversity found in many real-world applications, like online forums or video sharing platforms.

Technical Explanation

The paper first provides background on text-based toxicity classifiers and the limitations of existing multilingual voice toxicity detection systems. It then introduces the MuTox dataset, which contains audio recordings of toxic and non-toxic speech in 21 languages: English and Spanish, plus 19 others, according to the paper's abstract.

Building on this dataset, the researchers developed a zero-shot toxicity detection model that can analyze audio in any language, even those not seen during training. The model uses a multilingual speech representation and a toxicity classifier that is trained in a language-agnostic way.
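The idea of a language-agnostic classifier on top of a shared speech representation can be illustrated with a toy sketch. This is not the authors' implementation: the encoder is replaced by a hypothetical `fake_embeddings` function that generates synthetic vectors, the embedding size `DIM` is arbitrary, and the classifier is a plain logistic regression chosen for brevity. The sketch only assumes that toxic and non-toxic utterances are separated along a similar direction in the shared space regardless of language.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # hypothetical embedding size, not taken from the paper

# Assumption: toxicity shifts embeddings along one direction shared by all languages.
toxic_direction = rng.normal(size=DIM)
toxic_direction /= np.linalg.norm(toxic_direction)


def fake_embeddings(n):
    """Stand-in for a multilingual speech encoder applied to one language.

    Returns synthetic (embeddings, labels); each call draws a new small
    language-specific offset to mimic a different language.
    """
    labels = rng.integers(0, 2, size=n)
    lang_offset = rng.normal(scale=0.1, size=DIM)
    noise = rng.normal(scale=0.5, size=(n, DIM))
    x = noise + np.outer(2 * labels - 1, toxic_direction) + lang_offset
    return x, labels


def train_logreg(x, y, lr=0.1, steps=500):
    """Fit a logistic regression with batch gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict(x, w, b):
    """Label an embedding as toxic (1) when the logit is positive."""
    return (x @ w + b > 0).astype(int)


# Train on two "languages" pooled together, with no language identifier as input.
train_sets = [fake_embeddings(500) for _ in range(2)]
x_train = np.vstack([x for x, _ in train_sets])
y_train = np.concatenate([y for _, y in train_sets])
w, b = train_logreg(x_train, y_train)

# Zero-shot evaluation: a third "language" never seen during training.
x_new, y_new = fake_embeddings(500)
held_out_accuracy = float((predict(x_new, w, b) == y_new).mean())
print(f"held-out language accuracy: {held_out_accuracy:.3f}")
```

Because the classifier never sees which language an embedding came from, it transfers to the held-out language only to the extent that the encoder places all languages in a common space. That property is assumed here by construction; in a real system it has to come from the encoder.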

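Experiments show that the model performs well at detecting toxicity in multilingual audio. According to the paper's abstract, it outperforms existing text-based trainable classifiers by more than 1% AUC while covering more than ten times as many languages, and it improves precision and recall by roughly 2.5 times over a wordlist-based classifier with similar language coverage. The model also shows promising results on detecting toxicity in Ukrainian audio, a language not present in the training data.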
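The paper's abstract reports its comparison against text-based classifiers in terms of AUC (area under the ROC curve). As a reminder of what that metric measures, here is a minimal sketch that computes it as the probability that a randomly chosen toxic example receives a higher score than a randomly chosen non-toxic one, with ties counted as half. The function name `roc_auc` and the example labels and scores are illustrative and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC computed from all (positive, negative) pairs.

    labels: iterable of 0/1 ground-truth values (1 = toxic).
    scores: iterable of classifier scores, higher = more likely toxic.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative data: 3 toxic and 2 non-toxic utterances.
example_labels = [1, 0, 1, 0, 1]
example_scores = [0.9, 0.2, 0.6, 0.7, 0.8]
print(roc_auc(example_labels, example_scores))  # 5 of 6 pairs ranked correctly
```

The pairwise loop is quadratic and is meant for clarity only. AUC is threshold-free, which makes it convenient for comparing classifiers whose scores are on different scales, such as an audio-based and a text-based system.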

Critical Analysis

The paper presents a novel and promising approach to multilingual toxicity detection in audio. The use of a zero-shot model is a key strength, as it avoids the need for manual annotation and retraining when expanding to new languages.

However, the dataset is still limited in its language coverage, focusing primarily on high-resource languages. Expanding the MuTox dataset to include more diverse, low-resource languages would further demonstrate the model's broad applicability.

Additionally, the paper does not provide a detailed analysis of the types of toxic content the model is able to detect. Understanding the model's capabilities and limitations in identifying different forms of toxicity, such as hate speech, profanity, or threats, would be valuable for real-world deployment.

Conclusion

This work introduces a significant advancement in multilingual toxicity detection by developing a zero-shot model that can effectively analyze toxic content in audio across a range of languages. The MuTox dataset and the proposed model represent an important step towards building more inclusive and robust systems for identifying and mitigating online toxicity at scale.


Related Papers

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector

Marta R. Costa-jussà, Mariano Coria Meglioli, Pierre Andrews, David Dale, Prangthip Hansanti, Elahe Kalbassi, Alex Mourachko, Christophe Ropers, Carleigh Wood

Research in toxicity detection in natural language processing for the speech modality (audio-based) is quite limited, particularly for languages other than English. To address these limitations and lay the groundwork for truly multilingual audio-based toxicity detection, we introduce MuTox, the first highly multilingual audio-based dataset with toxicity labels. The dataset comprises 20,000 audio utterances for English and Spanish, and 4,000 for the other 19 languages. To demonstrate the quality of this dataset, we trained the MuTox audio-based toxicity classifier, which enables zero-shot toxicity detection across a wide range of languages. This classifier outperforms existing text-based trainable classifiers by more than 1% AUC, while expanding the language coverage more than tenfold. When compared to a wordlist-based classifier that covers a similar number of languages, MuTox improves precision and recall by approximately 2.5 times. This significant improvement underscores the potential of MuTox in advancing the field of audio-based toxicity detection.

6/28/2024

From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models

Luiza Pozzobon, Patrick Lewis, Sara Hooker, Beyza Ermis

To date, toxicity mitigation in language models has almost entirely been focused on single-language settings. As language models embrace multilingual capabilities, it's crucial our safety measures keep pace. Recognizing this research gap, our approach expands the scope of conventional toxicity mitigation to address the complexities presented by multiple languages. In the absence of sufficient annotated datasets across languages, we employ translated data to evaluate and enhance our mitigation techniques. We also compare finetuning mitigation approaches against retrieval-augmented techniques under both static and continual toxicity mitigation scenarios. This allows us to examine the effects of translation quality and the cross-lingual transfer on toxicity mitigation. We also explore how model size and data quantity affect the success of these mitigation efforts. Covering nine languages, our study represents a broad array of linguistic families and levels of resource availability, ranging from high to mid-resource languages. Through comprehensive experiments, we provide insights into the complexities of multilingual toxicity mitigation, offering valuable insights and paving the way for future research in this increasingly important field. Code and data are available at https://github.com/for-ai/goodtriever.

5/31/2024

Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment

Joseph Liu, Mahesh Kumar Nandwana, Janne Pylkkonen, Hannes Heikinheimo, Morgan McGuire

Toxicity classification for voice heavily relies on the semantic content of speech. We propose a novel framework that utilizes cross-modal learning to integrate the semantic embedding of text into a multilabel speech toxicity classifier during training. This enables us to incorporate textual information during training while still requiring only audio during inference. We evaluate this classifier on large-scale datasets with real-world characteristics to validate the effectiveness of this framework. Through ablation studies, we demonstrate that general-purpose semantic text embeddings are rich and aligned with speech for toxicity classification purposes. Conducting experiments across multiple languages at scale, we show improvements in voice toxicity classification across five languages and different toxicity categories.

6/18/2024

ToxVidLLM: A Multimodal LLM-based Framework for Toxicity Detection in Code-Mixed Videos

Krishanu Maity, A. S. Poornash, Sriparna Saha, Pushpak Bhattacharyya

In an era of rapidly evolving internet technology, the surge in multimodal content, including videos, has expanded the horizons of online communication. However, the detection of toxic content in this diverse landscape, particularly in low-resource code-mixed languages, remains a critical challenge. While substantial research has addressed toxic content detection in textual data, the realm of video content, especially in non-English languages, has been relatively underexplored. This paper addresses this research gap by introducing a benchmark dataset, the first of its kind, consisting of 931 videos with 4021 code-mixed Hindi-English utterances collected from YouTube. Each utterance within this dataset has been meticulously annotated for toxicity, severity, and sentiment labels. We have developed an advanced Multimodal Multitask framework built for Toxicity detection in Video Content by leveraging Language Models (LMs), crafted for the primary objective along with the additional tasks of conducting sentiment and severity analysis. ToxVidLM incorporates three key modules - the Encoder module, Cross-Modal Synchronization module, and Multitask module - crafting a generic multimodal LM customized for intricate video classification tasks. Our experiments reveal that incorporating multiple modalities from the videos substantially enhances the performance of toxic content detection by achieving an Accuracy and Weighted F1 score of 94.29% and 94.35%, respectively.

7/16/2024