Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words

Read original: arXiv:2407.16266 - Published 7/24/2024 by Yijie Chen, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Overview

  • Introduces AmbGIMT, a benchmark for evaluating gender bias in machine translation beyond binary gender
  • Proposes an evaluation process based on the Emotional Attitude Score (EAS), which quantifies ambiguous attitude words
  • Evaluates three recent open-source LLMs and one multilingual translation-specific model on the benchmark

Plain English Explanation

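This paper explores how machine translation systems treat gender identities beyond male and female. Earlier evaluations mostly checked gendered pronouns or masculine and feminine grammatical forms, using stereotypes triggered by occupations or by sentiment words with a clearly positive or negative attitude. Ambiguous attitude words are words whose attitude is not clearly positive or negative, which lets the evaluation extend to non-binary groups.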

The researchers compare how translation models perform in non-binary gender contexts and in binary-gender contexts, looking at both translation quality and the attitude the output conveys. This is important for ensuring machine translation tools are fair and equitable.

By evaluating the translations generated by different machine learning models, the researchers can identify areas for improvement and provide insights to help develop more gender-inclusive natural language processing systems.

Technical Explanation

The paper presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a benchmark that assesses gender bias in machine translation beyond binary gender. It pairs the benchmark with an evaluation process based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.

The researchers evaluated three recent open-source LLMs and one multilingual translation-specific model on the benchmark. The analysis compared translations in non-binary gender contexts with translations in binary-gender contexts, both for translation quality and for the attitude expressed.

The findings suggest that current systems handle non-binary gender contexts poorly. Translation quality in those contexts is markedly lower, and the output exhibits more negative attitudes than in binary-gender contexts.
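A group-level attitude comparison of this kind can be pictured with a small sketch. It is not the paper's EAS procedure, whose definition is not given in this summary: the lexicon values, the averaging rule, and the English stand-in "translations" below are all assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical attitude lexicon (word -> score in [-1, 1]).
# The values are invented for illustration and do not come from the paper.
ATTITUDE = {"assertive": 0.4, "pushy": -0.6, "confident": 0.7, "arrogant": -0.7}


def attitude_score(translation: str) -> float:
    """Mean lexicon score of the scored words in one translation (0.0 if none)."""
    scores = [ATTITUDE[w] for w in translation.lower().split() if w in ATTITUDE]
    return mean(scores) if scores else 0.0


def group_means(outputs: dict) -> dict:
    """Average attitude score per gender-context group."""
    return {group: mean(attitude_score(t) for t in texts)
            for group, texts in outputs.items()}


# Toy strings standing in for model output in each context group.
outputs = {
    "binary": ["she is assertive", "he is confident"],
    "non_binary": ["they are pushy", "they are arrogant"],
}

means = group_means(outputs)
# A positive gap means the non-binary group received more negative wording.
gap = means["binary"] - means["non_binary"]
print(means, round(gap, 2))
```

In a real evaluation the scores would come from the benchmark's own scoring process and the texts from actual model output in the target language; the sketch only shows the shape of the comparison.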

The paper also reports that incorporating constraint context for gender identity terms into the prompt can substantially reduce translation bias, although the bias remains evident even with the constraints in place.
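The paper's abstract mentions adding constraint context for gender identity terms to the prompt. A minimal sketch of what such a prompt builder might look like follows. The function name `build_prompt`, the prompt wording, the example constraint, and the target language are all hypothetical; the paper's actual prompts are not reproduced in this summary.

```python
from typing import Optional


def build_prompt(source: str, target_lang: str,
                 constraint: Optional[str] = None) -> str:
    """Assemble a translation prompt, optionally with a constraint context line."""
    parts = [f"Translate the following sentence into {target_lang}."]
    if constraint:
        # Hypothetical format: the constraint is passed as an extra context line.
        parts.append(f"Context: {constraint}")
    parts.append(f"Sentence: {source}")
    parts.append("Translation:")
    return "\n".join(parts)


source = "My non-binary friend is ambitious."
plain = build_prompt(source, "German")
constrained = build_prompt(
    source,
    "German",
    constraint="'non-binary' is a gender identity term; keep its meaning and a neutral tone.",
)
print(constrained)
```

Sending both `plain` and `constrained` to the same model and scoring the two outputs would give a rough before-and-after comparison of the constraint's effect.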

Critical Analysis

The paper highlights an important limitation of existing machine translation systems: they handle non-binary gender contexts less reliably than binary ones, with lower translation quality and more negative attitudes in the output. This is a significant shortcoming, as translation systems should serve the full spectrum of gender identities and avoid reinforcing binary gender stereotypes.

While the research provides valuable insights, it also raises questions about the broader challenges in achieving gender fairness in natural language processing. The study focuses on one type of word, ambiguous attitude words, but other linguistic constructs may pose similar challenges for machine translation.

Furthermore, the paper does not delve into the potential societal implications of biased machine translations. It would be valuable to explore how these issues might impact marginalized communities and how to mitigate potential harms.

Conclusion

This paper makes an important contribution by revealing the limitations of current machine translation models in non-binary gender contexts. The findings underscore the need for translation techniques that treat all gender identities equitably and generate truly inclusive translations.

As language models become increasingly prominent in various applications, addressing these biases and ensuring equitable representation is crucial. The insights from this research can inform the development of more inclusive and socially responsible machine translation systems, ultimately promoting greater diversity and inclusivity in language technology.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words

Yijie Chen, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Gender bias has been a focal point in the study of bias in machine translation and language models. Existing machine translation gender bias evaluations are primarily focused on male and female genders, limiting the scope of the evaluation. To assess gender bias accurately, these studies often rely on calculating the accuracy of gender pronouns or the masculine and feminine attributes of grammatical gender via the stereotypes triggered by occupations or sentiment words (i.e., clear positive or negative attitude), which cannot extend to non-binary groups. This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), which assesses gender bias beyond binary gender. Meanwhile, we propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words. In evaluating three recent and effective open-source LLMs and one powerful multilingual translation-specific model, our main observations are: (1) The translation performance within non-binary gender contexts is markedly inferior in terms of translation quality and exhibits more negative attitudes than binary-gender contexts. (2) The analysis experiments indicate that incorporating constraint context in prompts for gender identity terms can substantially reduce translation bias, while the bias remains evident despite the presence of the constraints. The code is publicly available at https://github.com/pppa2019/ambGIMT.

7/24/2024

Generating Gender Alternatives in Machine Translation

Sarthak Garg, Mozhdeh Gheini, Clara Emmanuel, Tatiana Likhomanenko, Qin Gao, Matthias Paulik

Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term the nurse) into the gendered form that is most prevalent in the systems' training data (e.g., enfermera, the Spanish term for a female nurse). This often reflects and perpetuates harmful stereotypes present in society. With MT user interfaces in mind that allow for resolving gender ambiguity in a frictionless manner, we study the problem of generating all grammatically correct gendered translation alternatives. We open source train and test datasets for five language pairs and establish benchmarks for this task. Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead.

7/31/2024

The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs

Aleix Sant, Carlos Escolano, Audrey Mash, Francesca De Luca Fornaciari, Maite Melero

This paper studies gender bias in machine translation through the lens of Large Language Models (LLMs). Four widely-used test sets are employed to benchmark various base LLMs, comparing their translation quality and gender bias against state-of-the-art Neural Machine Translation (NMT) models for English to Catalan (En → Ca) and English to Spanish (En → Es) translation directions. Our findings reveal pervasive gender bias across all models, with base LLMs exhibiting a higher degree of bias compared to NMT models. To combat this bias, we explore prompting engineering techniques applied to an instruction-tuned LLM. We identify a prompt structure that significantly reduces gender bias by up to 12% on the WinoMT evaluation dataset compared to more straightforward prompts. These results significantly reduce the gender bias accuracy gap between LLMs and traditional NMT systems.

7/29/2024

Investigating Markers and Drivers of Gender Bias in Machine Translations

Peter J Barclay (Edinburgh Napier University), Ashkan Sami (Edinburgh Napier University)

Implicit gender bias in Large Language Models (LLMs) is a well-documented problem, and implications of gender introduced into automatic translations can perpetuate real-world biases. However, some LLMs use heuristics or post-processing to mask such bias, making investigation difficult. Here, we examine bias in LLMs via back-translation, using the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks used in a previous study. Each statement starts with 'she', and is translated first into a 'genderless' intermediate language then back into English; we then examine pronoun-choice in the back-translated texts. We expand prior research in the following ways: (1) by comparing results across five intermediate languages, namely Finnish, Indonesian, Estonian, Turkish and Hungarian; (2) by proposing a novel metric for assessing the variation in gender implied in the repeated translations, avoiding the over-interpretation of individual pronouns, apparent in earlier work; (3) by investigating sentence features that drive bias; (4) and by comparing results from three time-lapsed datasets to establish the reproducibility of the approach. We found that some languages display similar patterns of pronoun use, falling into three loose groups, but that patterns vary between groups; this underlines the need to work with multiple languages. We also identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations. Moreover, we see a good level of replicability in the results, and establish that our variation metric proves robust despite an obvious change in the behaviour of the DeepL translation API during the course of the study. These results show that the back-translation method can provide further insights into bias in language models.

4/3/2024