Text Detoxification as Style Transfer in English and Hindi

Read original: arXiv:2402.07767 - Published 6/11/2024 by Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

Overview

This paper focuses on "text detoxification" - the automatic conversion of toxic text into non-toxic text.
This task is considered a Text Style Transfer (TST) problem, where the text's style changes while the content is preserved.
The researchers present three approaches to text detoxification: knowledge transfer from a similar task, a multi-task learning approach, and a "delete and reconstruct" approach.
The researchers utilize a dataset provided by Dementieva et al. (2021) and introduce a small Hindi parallel dataset for evaluation purposes.
The results indicate that the researchers' approaches remove toxicity while preserving the original content and keeping the output fluent.

Plain English Explanation

The paper focuses on a problem called "text detoxification," which is the process of automatically converting toxic or offensive text into non-toxic, respectful text. This is an important task for creating safer and more inclusive online communication.

The researchers tried three different approaches to solve this problem. The first approach reuses knowledge gained from a similar task and applies it to text detoxification. The second approach trains a single model on several related tasks at once, such as detecting toxic text and generating non-toxic text. The third approach deletes the toxic parts of the text and then reconstructs the text in a non-toxic way.

To test their approaches, the researchers used a dataset of toxic text and the corresponding non-toxic versions, provided by another research team. They also created a smaller dataset of toxic and non-toxic text in Hindi, which they used to evaluate their methods.

The results show that the researchers' approaches were able to successfully remove the toxicity from the text while still keeping the original meaning and sounding natural. This is an important step towards making online communication more respectful and inclusive.

Technical Explanation

The researchers explore three main approaches to the text detoxification task:

Knowledge Transfer: The researchers transfer knowledge gained from a similar text style transfer task to their text detoxification models.
Multi-task Learning: The researchers combine sequence-to-sequence modeling for text generation with various toxicity classification tasks in a multi-task learning framework.
Delete and Reconstruct: The researchers take a "delete and reconstruct" approach, where they first remove the toxic parts of the text and then generate a new, non-toxic version of the text.
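
The delete-and-reconstruct idea can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed word list `TOXIC_LEXICON`, the `<mask>` placeholder, and the `fill` callback (a stand-in for a generation model) are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon. A real system would more likely use a trained
# toxicity classifier or attribution scores than a fixed word list.
TOXIC_LEXICON = {"stupid", "idiot", "dumb"}
MASK = "<mask>"


def delete_toxic(text, lexicon=TOXIC_LEXICON, mask=MASK):
    """Delete step: replace every token found in the lexicon with a mask."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return [mask if tok.lower() in lexicon else tok for tok in tokens]


def reconstruct(tokens, fill=None, mask=MASK):
    """Reconstruct step: rebuild a sentence from the masked tokens.

    `fill` is a hypothetical stand-in for a generation model. It receives
    the token list and returns one replacement string per mask. Without
    it, masked positions are simply dropped.
    """
    replacements = iter(fill(tokens)) if fill else iter(())
    out = []
    for tok in tokens:
        if tok == mask:
            rep = next(replacements, "")
            if rep:
                out.append(rep)
        else:
            out.append(tok)
    text = " ".join(out)
    # Re-attach punctuation that tokenisation split off.
    return re.sub(r"\s+([.,!?;:])", r"\1", text)


masked = delete_toxic("That was a stupid idea!")
print(masked)
print(reconstruct(masked, fill=lambda toks: ["bad"]))
```

Dropping the masked words without a `fill` model leaves an ungrammatical sentence ("That was a idea!"), which is why the approach needs a reconstruction step rather than plain deletion.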

To support their research, the researchers utilize a dataset provided by Dementieva et al. (2021), which contains multiple versions of detoxified texts corresponding to toxic texts. Expert human annotators then selected the best detoxified variant for each toxic text, creating a dataset where each toxic sentence is paired with a single, appropriate detoxified version.
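
Collapsing a multi-reference dataset into one-to-one pairs could look like the sketch below. The summary does not describe the annotation protocol, so majority voting, the tie-breaking rule, and the record fields (`toxic`, `variants`, `votes`) are assumptions, not the authors' procedure.

```python
from collections import Counter


def select_best_variant(variants, votes):
    """Return the variant chosen most often by annotators.

    variants: candidate detoxified strings for one toxic sentence
    votes: variant indices, one per annotator (hypothetical format)
    Ties go to the lowest index so the result is deterministic.
    """
    if not variants:
        raise ValueError("need at least one variant")
    # Ignore votes that point outside the candidate list.
    counts = Counter(v for v in votes if 0 <= v < len(variants))
    if not counts:
        return variants[0]
    top = max(counts.values())
    best_index = min(i for i, c in counts.items() if c == top)
    return variants[best_index]


def build_parallel_pairs(records):
    """Collapse multi-reference records into (toxic, detoxified) pairs."""
    return [
        (r["toxic"], select_best_variant(r["variants"], r["votes"]))
        for r in records
    ]


records = [
    {
        "toxic": "toxic sentence",
        "variants": ["rewrite A", "rewrite B", "rewrite C"],
        "votes": [1, 2, 1],
    }
]
print(build_parallel_pairs(records))
```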

Additionally, the researchers introduce a small Hindi parallel dataset, aligning with a part of the English dataset, to enable evaluation of their text detoxification approaches in a multilingual setting.

The experimental results indicate that the researchers' approaches remove toxicity while preserving the original content and maintaining fluency, as measured by human evaluation.
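
To make the three-way trade-off concrete, the sketch below combines per-sentence scores for style (non-toxicity), content preservation, and fluency into one number. The summary does not say how the paper aggregates its criteria, so the product-then-average rule and the assumption that every score lies in [0, 1] are illustrative choices only.

```python
from statistics import mean


def joint_score(style, content, fluency):
    """Average of per-sentence products of three scores in [0, 1].

    Multiplying means an output scores well only if it is good on all
    three criteria at once.
    """
    if not (len(style) == len(content) == len(fluency)):
        raise ValueError("score lists must have equal length")
    if not style:
        raise ValueError("need at least one sentence")
    for s in (*style, *content, *fluency):
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return mean(s * c * f for s, c, f in zip(style, content, fluency))


# A system that deletes everything is non-toxic but keeps no content,
# so its joint score collapses to zero.
print(joint_score([1.0], [0.0], [1.0]))
```

A product penalizes degenerate strategies, such as emitting an empty or unrelated sentence, more strongly than a plain average of the three scores would.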

Critical Analysis

The paper presents a comprehensive approach to the important problem of text detoxification, which has significant implications for creating safer and more inclusive online communication. The researchers' use of both knowledge transfer and multi-task learning techniques is a promising direction, as it leverages insights from related tasks to inform the detoxification models.

However, the paper does not provide a detailed analysis of the limitations or potential biases in the datasets used. It would be valuable to understand how the selected "best" detoxified versions were determined and whether there are any systematic biases in the human annotations. Additionally, the small size of the Hindi parallel dataset may limit the generalizability of the findings to other languages.

Further research could explore the performance of the researchers' approaches on larger, more diverse datasets, as well as investigate the impact of different text detoxification techniques on downstream applications, such as text generation or social media moderation. Rigorous testing for unintended biases or harmful outputs would also be an important next step.

Conclusion

This paper presents a valuable contribution to the field of text detoxification, a crucial task for promoting safer and more respectful online communication. The researchers' exploration of knowledge transfer, multi-task learning, and a "delete and reconstruct" approach demonstrates the potential of these techniques to effectively remove toxicity from text while preserving the original content and fluency.

The introduction of a Hindi parallel dataset also opens the door for further investigations into the multilingual aspects of text detoxification. While the paper leaves room for further research into dataset quality, potential biases, and application-specific impacts, it represents an important step forward in addressing the pressing challenge of online toxicity.

Related Papers

Text Detoxification as Style Transfer in English and Hindi

Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text. This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved. We present three approaches: knowledge transfer from a similar task, a multi-task learning approach combining sequence-to-sequence modeling with various toxicity classification tasks, and a delete and reconstruct approach. To support our research, we utilize a dataset provided by Dementieva et al. (2021), which contains multiple versions of detoxified texts corresponding to toxic texts. In our experiments, we selected the best variants through expert human annotators, creating a dataset where each toxic sentence is paired with a single, appropriate detoxified version. Additionally, we introduced a small Hindi parallel dataset, aligning with a part of the English dataset, suitable for evaluation purposes. Our results demonstrate that our approach effectively balances text detoxification while preserving the actual content and maintaining fluency.

6/11/2024

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Text detoxification is a textual style transfer (TST) task where a text is paraphrased from a toxic surface form, e.g. featuring rude words, to the neutral register. Recently, text detoxification methods found their applications in various tasks such as detoxification of Large Language Models (LLMs) (Leong et al., 2023; He et al., 2024; Tang et al., 2023) and toxic speech combating in social networks (Deng et al., 2023; Mun et al., 2023; Agarwal et al., 2023). All these applications are extremely important to ensure safe communication in modern digital worlds. However, the previous approaches for parallel text detoxification corpora collection -- ParaDetox (Logacheva et al., 2022) and APPADIA (Atwell et al., 2022) -- were explored only in a monolingual setup. In this work, we aim to extend the ParaDetox pipeline to multiple languages, presenting MultiParaDetox to automate parallel detoxification corpus collection for potentially any language. Then, we experiment with different text detoxification models -- from unsupervised baselines to LLMs and fine-tuned models on the presented parallel corpora -- showing the great benefit of parallel corpus presence to obtain state-of-the-art text detoxification models for any language.

4/3/2024

Are Large Language Models Actually Good at Text Style Transfer?

Sourabrata Mukherjee, Atul Kr. Ojha, Ondřej Dušek

We analyze the performance of large language models (LLMs) on Text Style Transfer (TST), specifically focusing on sentiment transfer and text detoxification across three languages: English, Hindi, and Bengali. Text Style Transfer involves modifying the linguistic style of a text while preserving its core content. We evaluate the capabilities of pre-trained LLMs using zero-shot and few-shot prompting as well as parameter-efficient finetuning on publicly available datasets. Our evaluation using automatic metrics, GPT-4 and human evaluations reveals that while some prompted LLMs perform well in English, their performance on other languages (Hindi, Bengali) remains average. However, finetuning significantly improves results compared to zero-shot and few-shot prompting, making them comparable to previous state-of-the-art. This underscores the necessity of dedicated datasets and specialized models for effective TST.

8/28/2024

Multilingual Text Style Transfer: Datasets & Models for Indian Languages

Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondřej Dušek

Text style transfer (TST) involves altering the linguistic style of a text while preserving its core content. This paper focuses on sentiment transfer, a popular TST subtask, across a spectrum of Indian languages: Hindi, Magahi, Malayalam, Marathi, Punjabi, Odia, Telugu, and Urdu, expanding upon previous work on English-Bangla sentiment transfer (Mukherjee et al., 2023). We introduce dedicated datasets of 1,000 positive and 1,000 negative style-parallel sentences for each of these eight languages. We then evaluate the performance of various benchmark models categorized into parallel, non-parallel, cross-lingual, and shared learning approaches, including the Llama2 and GPT-3.5 large language models (LLMs). Our experiments highlight the significance of parallel data in TST and demonstrate the effectiveness of the Masked Style Filling (MSF) approach (Mukherjee et al., 2023) in non-parallel techniques. Moreover, cross-lingual and joint multilingual learning methods show promise, offering insights into selecting optimal models tailored to the specific language and task requirements. To the best of our knowledge, this work represents the first comprehensive exploration of the TST task as sentiment transfer across a diverse set of languages.

8/28/2024