MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Read original: arXiv:2404.02037 - Published 4/3/2024 by Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Overview

The paper presents MultiParaDetox, a pipeline that extends the ParaDetox procedure for collecting parallel detoxification corpora to new languages.
Text detoxification is a style transfer task: a toxic text is paraphrased into a neutral register while its content is preserved.
Parallel data pairs each toxic text with a non-toxic version of the same text, and models trained on such pairs learn to detoxify text in the corresponding language.
The researchers compare unsupervised baselines, LLMs, and models fine-tuned on the new parallel corpora, and report that having a parallel corpus is what yields state-of-the-art detoxification models.

Plain English Explanation

The paper presents a way to clean up rude or harmful language in text, and to do so in more than one language. The key idea is to use "parallel data": pairs made of the original toxic text and a cleaned-up version of the same text. Models trained on these pairs can learn how to rewrite toxic text into polite, non-offensive language.

This is useful because text detoxification is an important task for making online platforms and communications more inclusive and respectful. But earlier approaches to collecting parallel detoxification data were explored for only a single language at a time. MultiParaDetox automates that collection so that text detoxification can be extended to new languages.

The researchers show that their method outperforms previous detoxification models when applied to multiple languages. This suggests it could be a valuable tool for making a wide range of online content and conversations more civil and welcoming, regardless of the language used.

Technical Explanation

The paper introduces MultiParaDetox, a text detoxification approach that can be applied to new languages by using parallel data. Parallel data consists of toxic text paired with its cleaned-up counterpart.
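
To make the notion of parallel data concrete, here is a minimal sketch of how such pairs might be represented and grouped by language. The `DetoxPair` class, its field names, and the example sentences are illustrative assumptions, not the format or contents of the paper's corpora.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DetoxPair:
    """One parallel example: a toxic sentence and its neutral paraphrase."""
    lang: str     # language code such as "en" (hypothetical field name)
    toxic: str    # original toxic sentence
    neutral: str  # paraphrase with the same meaning in a neutral register


def group_by_language(pairs):
    """Index pairs by language so each language can be handled separately."""
    grouped = defaultdict(list)
    for pair in pairs:
        grouped[pair.lang].append(pair)
    return dict(grouped)


# Invented toy examples, not taken from any published corpus.
corpus = [
    DetoxPair("en", "this stupid update broke everything", "this update broke everything"),
    DetoxPair("en", "shut up and read the manual", "please read the manual"),
    DetoxPair("es", "qué idea tan estúpida", "qué idea tan mala"),
]

by_lang = group_by_language(corpus)
```

Each pair keeps the content fixed and changes only the register, which is what lets a model learn the toxic-to-neutral rewrite rather than a change of topic.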

The core idea is to train models on this parallel data so that they learn to rewrite toxic language into non-toxic alternatives. The authors experiment with a range of detoxification models, from unsupervised baselines to LLMs and models fine-tuned on the collected parallel corpora, and evaluate them on several languages.
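
For contrast with models trained on parallel pairs, the following toy sketch shows what a very simple approach that needs no parallel data could look like: deleting words found in a toxic-word lexicon. This is an illustrative assumption rather than the paper's implementation; the `TOXIC_WORDS` list and the `delete_baseline` function are hypothetical.

```python
import re

# Hypothetical mini-lexicon; a real system would need a curated list per language.
TOXIC_WORDS = {"stupid", "idiotic", "damn"}


def delete_baseline(text, lexicon=TOXIC_WORDS):
    """Drop every token whose lowercased, punctuation-stripped form is in the lexicon."""
    kept = []
    for token in text.split():
        # Strip leading and trailing non-word characters before the lookup.
        core = re.sub(r"^\W+|\W+$", "", token).lower()
        if core not in lexicon:
            kept.append(token)
    return " ".join(kept)


print(delete_baseline("this stupid update broke everything"))
print(delete_baseline("That is a damn, stupid idea."))
```

The second call returns "That is a idea.", which shows the weakness of pure deletion: the toxic words are gone, but so are fluency and part of the meaning. A model trained on toxic and neutral pairs can instead learn to paraphrase.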

The results show that models fine-tuned on the MultiParaDetox corpora outperform the other detoxification methods tested. This suggests that collecting parallel data is an effective way to extend detoxification capabilities to new languages.

Critical Analysis

The paper provides a thorough evaluation of the MultiParaDetox approach, testing it on multiple languages and comparing to prior work. The results demonstrate the value of leveraging parallel data for expanding text detoxification to new settings.

However, the paper does not deeply explore the limitations of the approach. For example, it is unclear how well MultiParaDetox would perform on more low-resource languages where parallel data may be scarce. The authors also do not discuss potential biases or social implications of the detoxification models.

Additionally, the paper focuses on the technical aspects of the method and says little about the broader societal impact of text detoxification. A fuller discussion of the ethics and real-world applications of this technology could strengthen the work.

Overall, the MultiParaDetox method represents a promising step forward in making text detoxification more scalable across languages. But further research is needed to fully understand the capabilities and limitations of this approach.

Conclusion

The MultiParaDetox paper introduces an effective way to extend text detoxification to new languages by leveraging parallel data. This allows the benefits of toxic language removal to be applied more widely, supporting efforts to create more inclusive and respectful online spaces.

While the technical evaluation is thorough, the paper could be strengthened by a deeper consideration of the social and ethical implications of this technology. Nonetheless, the core contribution - demonstrating how parallel data can enable cross-lingual text detoxification - represents an important advance in this crucial area of research.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Text detoxification is a textual style transfer (TST) task where a text is paraphrased from a toxic surface form, e.g. featuring rude words, to the neutral register. Recently, text detoxification methods found their applications in various tasks such as detoxification of Large Language Models (LLMs) (Leong et al., 2023; He et al., 2024; Tang et al., 2023) and toxic speech combating in social networks (Deng et al., 2023; Mun et al., 2023; Agarwal et al., 2023). All these applications are extremely important to ensure safe communication in modern digital worlds. However, the previous approaches for parallel text detoxification corpora collection -- ParaDetox (Logacheva et al., 2022) and APPDIA (Atwell et al., 2022) -- were explored only in a monolingual setup. In this work, we aim to extend the ParaDetox pipeline to multiple languages, presenting MultiParaDetox to automate parallel detoxification corpus collection for potentially any language. Then, we experiment with different text detoxification models -- from unsupervised baselines to LLMs and fine-tuned models on the presented parallel corpora -- showing the great benefit of parallel corpus presence to obtain state-of-the-art text detoxification models for any language.

4/3/2024

Text Detoxification as Style Transfer in English and Hindi

Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text. This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved. We present three approaches: knowledge transfer from a similar task; a multi-task learning approach that combines sequence-to-sequence modeling with various toxicity classification tasks; and a delete-and-reconstruct approach. To support our research, we utilize a dataset provided by Dementieva et al. (2021), which contains multiple versions of detoxified texts corresponding to toxic texts. In our experiments, we selected the best variants through expert human annotators, creating a dataset where each toxic sentence is paired with a single, appropriate detoxified version. Additionally, we introduced a small Hindi parallel dataset, aligning with a part of the English dataset, suitable for evaluation purposes. Our results demonstrate that our approach effectively balances text detoxification while preserving the actual content and maintaining fluency.

6/11/2024

SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification

Elisei Rykov, Konstantin Zaytsev, Ivan Anisimov, Alexandr Voronin

This paper presents a solution for the Multilingual Text Detoxification task in the PAN-2024 competition of the SmurfCat team. Using data augmentation through machine translation and a special filtering procedure, we collected an additional multilingual parallel dataset for text detoxification. Using the obtained data, we fine-tuned several multilingual sequence-to-sequence models, such as mT0 and Aya, on a text detoxification task. We applied the ORPO alignment technique to the final model. Our final model has only 3.7 billion parameters and achieves state-of-the-art results for the Ukrainian language and near state-of-the-art results for other languages. In the competition, our team achieved first place in the automated evaluation with a score of 0.52 and second place in the final human evaluation with a score of 0.74.

7/11/2024

GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification

Ali Pesaranghader, Nikhil Verma, Manasa Bharadwaj

Harmful and offensive communication or content is detrimental to social bonding and the mental state of users on social media platforms. Text detoxification is a crucial task in natural language processing (NLP), where the goal is removing profanity and toxicity from text while preserving its content. Supervised and unsupervised learning are common approaches for designing text detoxification solutions. However, these methods necessitate fine-tuning, leading to computational overhead. In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. We utilize zero-shot and few-shot prompting techniques for detoxifying input sentences. To generate few-shot prompts, we propose two methods: word-matching example selection (WMES) and context-matching example selection (CMES). We additionally take into account ensemble in-context learning (EICL) where the ensemble is shaped by base prompts from zero-shot and all few-shot settings. We use ParaDetox and APPDIA as benchmark detoxification datasets. Our experimental results show that the zero-shot solution achieves promising performance, while our best few-shot setting outperforms the state-of-the-art models on ParaDetox and shows comparable results on APPDIA. Our EICL solutions obtain the greatest performance, adding at least 10% improvement, against both datasets.

4/5/2024