Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation






Published 5/3/2024 by Yixin Wan, Arjun Subramonian, Anaelia Ovalle, Zongyu Lin, Ashima Suvarna, Christina Chance, Hritik Bansal, Rebecca Pattichis, Kai-Wei Chang



The recent advancement of large and powerful models with Text-to-Image (T2I) generation abilities -- such as OpenAI's DALLE-3 and Google's Gemini -- enables users to generate high-quality images from textual prompts. However, it has become increasingly evident that even simple prompts could cause T2I models to exhibit conspicuous social bias in generated images. Such bias might lead to both allocational and representational harms in society, further marginalizing minority groups. Noting this problem, a large body of recent works has been dedicated to investigating different dimensions of bias in T2I systems. However, an extensive review of these studies is lacking, hindering a systematic understanding of current progress and research gaps. We present the first extensive survey on bias in T2I generative models. In this survey, we review prior studies on dimensions of bias: Gender, Skintone, and Geo-Culture. Specifically, we discuss how these works define, evaluate, and mitigate different aspects of bias. We found that: (1) while gender and skintone biases are widely studied, geo-cultural bias remains under-explored; (2) most works on gender and skintone bias investigated occupational association, while other aspects are less frequently studied; (3) almost all gender bias works overlook non-binary identities in their studies; (4) evaluation datasets and metrics are scattered, with no unified framework for measuring biases; and (5) current mitigation methods fail to resolve biases comprehensively. Based on current limitations, we point out future research directions that contribute to human-centric definitions, evaluations, and mitigation of biases. We hope to highlight the importance of studying biases in T2I systems, as well as encourage future efforts to holistically understand and tackle biases, building fair and trustworthy T2I technologies for everyone.



  • The paper provides a comprehensive survey of biases in text-to-image (T2I) generation models, including definitions, evaluation methods, and mitigation approaches.
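  • It covers three dimensions of bias: gender, skintone, and geo-culture. Gender and skintone biases are widely studied, while geo-cultural bias remains under-explored.
  • It also finds that almost all gender bias work overlooks non-binary identities, that evaluation datasets and metrics lack a unified framework, and that current mitigation methods do not resolve biases comprehensively.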

Plain English Explanation

The paper examines the issue of bias in text-to-image (T2I) generation models, which are AI systems that can create images based on textual descriptions. These models are becoming increasingly powerful and widespread, but they can also reflect and amplify societal biases.

The researchers define different types of bias, such as gender bias, where the model may generate stereotypical or unrepresentative images based on certain gender-related prompts. For example, if asked to generate an image of a "CEO," the model might produce a picture of a man, even though many CEOs are women.

The paper also discusses methods for evaluating and measuring these biases, as well as approaches for mitigating them. This includes techniques like debiasing the training data, adjusting the model architecture, and using specialized loss functions during training.

Understanding and addressing bias in T2I models is critical, as these systems are being used in various applications, from art generation to product design. Unchecked biases can lead to the perpetuation of harmful stereotypes and the exclusion of underrepresented groups.

Technical Explanation

The paper starts by defining bias in T2I models along three dimensions: gender, skintone, and geo-culture. Taking gender as an example, bias can manifest in the visual representation of individuals, the attributes associated with them, and the relative frequency of different genders in the generated images.

To evaluate these biases, the paper presents various metrics and benchmarks, such as measuring the gender distribution of generated images, assessing the correlation between gender and perceived attributes, and comparing the representation of different genders across various domains (e.g., professions, activities).
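The first of these metrics, the distribution of perceived groups across generated images, can be sketched in a few lines. This is a minimal illustration, not a metric taken from the survey: the function names `group_distribution` and `parity_gap`, the example labels, and the choice of a uniform reference distribution are all assumptions. In practice the labels would come from human annotators or a classifier, and a different reference (for example, real-world occupation statistics) may be more appropriate.

```python
from collections import Counter

def group_distribution(labels, groups):
    """Fraction of images assigned to each group in `groups`."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    return {g: counts.get(g, 0) / total for g in groups}

def parity_gap(distribution):
    """Total variation distance from the uniform distribution.

    0.0 means every group appears equally often; larger values mean
    the generated images are more skewed toward some groups.
    """
    uniform = 1.0 / len(distribution)
    return 0.5 * sum(abs(p - uniform) for p in distribution.values())

# Hypothetical annotations for 10 images generated from the prompt "a CEO".
labels = ["man"] * 8 + ["woman"] * 2
groups = ["man", "woman", "non-binary"]  # illustrative group set (assumption)

dist = group_distribution(labels, groups)
gap = parity_gap(dist)
print(dist, round(gap, 3))
```

Using a uniform reference is only one possible design choice; because evaluation metrics in this area are not unified, results computed this way are not directly comparable with scores reported by other methods.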

The researchers then review a range of mitigation strategies, including data augmentation techniques, model architecture modifications, and specialized training objectives. These approaches aim to reduce the biases observed in the generated images while maintaining the overall performance of the T2I models.
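As one illustration of a data-level intervention, the sketch below computes inverse-frequency sample weights so that each group contributes equally to a training objective. This is a generic rebalancing technique used here as a stand-in, not a method described in the survey; the function name `balancing_weights` and the example labels are hypothetical.

```python
from collections import Counter

def balancing_weights(group_labels):
    """Inverse-frequency weight for each training example.

    Each example in group g gets weight total / (n_groups * count[g]),
    so the weights of every group sum to the same value and the
    weights overall sum to the number of examples.
    """
    counts = Counter(group_labels)
    n_groups = len(counts)
    total = len(group_labels)
    return [total / (n_groups * counts[g]) for g in group_labels]

# Hypothetical group annotations for four training images.
group_labels = ["man", "man", "man", "woman"]
weights = balancing_weights(group_labels)
print(weights)
```

Reweighting leaves the data itself unchanged, so it cannot add representation for groups that are absent from the training set; that case would need new or augmented data.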

The paper also discusses the challenges in addressing biases, particularly when they are deeply rooted in the training data or the underlying language models used in the T2I systems. The researchers highlight the need for holistic and interdisciplinary approaches to tackle this complex problem.

Critical Analysis

The paper provides a thorough and well-structured review of bias in T2I models, which is a pressing issue as these systems become more widely adopted. The researchers have done an excellent job of defining the problem, outlining evaluation methods, and exploring mitigation strategies.

One potential limitation, which the survey itself reports, is that coverage of the literature is uneven: gender and skintone biases are widely studied, mostly through occupational associations, while geo-cultural bias remains under-explored and almost all gender bias work overlooks non-binary identities. Given the multifaceted nature of bias, a more in-depth discussion of these less-studied dimensions and their unique challenges would have been valuable.

Additionally, the paper acknowledges the difficulty of addressing deeply rooted biases in the training data and underlying language models. This raises questions about the long-term feasibility of the proposed mitigation strategies and the need for more fundamental changes in the way these models are developed and deployed.

Overall, the paper is a valuable contribution to the field, highlighting the importance of understanding and addressing bias in T2I models. It serves as a solid foundation for further research and development in this critical area.

Conclusion


The paper provides a comprehensive survey of bias in text-to-image generation models, covering gender, skintone, and geo-cultural bias. It reviews how prior work defines these biases, outlines evaluation methods, and explores a range of mitigation strategies.

Understanding and addressing bias in T2I models is crucial as these systems become increasingly prevalent in various applications. Unchecked biases can lead to the perpetuation of harmful stereotypes and the exclusion of underrepresented groups.

The paper's thorough examination of the problem and the proposed solutions offer valuable insights for researchers, developers, and end-users of T2I systems. While the challenges in addressing bias are significant, this work represents an important step towards more equitable and inclusive AI-generated content.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Severity Controlled Text-to-Image Generative Model Bias Manipulation

Jordan Vice, Naveed Akhtar, Richard Hartley, Ajmal Mian





Text-to-image (T2I) generative models are gaining wide popularity, especially in public domains. However, their intrinsic bias and potential malicious manipulations remain under-explored. Charting the susceptibility of T2I models to such manipulation, we first expose the new possibility of a dynamic and computationally efficient exploitation of model bias by targeting the embedded language models. By leveraging mathematical foundations of vector algebra, our technique enables a scalable and convenient control over the severity of output manipulation through model bias. As a by-product, this control also allows a form of precise prompt engineering to generate images which are generally implausible with regular text prompts. We also demonstrate a constructive application of our manipulation for balancing the frequency of generated classes - as in model debiasing. Our technique does not require training and is also framed as a backdoor attack with severity control using semantically-null text triggers in the prompts. With extensive analysis, we present interesting qualitative and quantitative results to expose potential manipulation possibilities for T2I models. Key-words: Text-to-Image Models, Generative Models, Backdoor Attacks, Prompt Engineering, Bias



Can Prompt Modifiers Control Bias? A Comparative Analysis of Text-to-Image Generative Models

Philip Wootaek Shin, Jihyun Janice Ahn, Wenpeng Yin, Jack Sampson, Vijaykrishnan Narayanan





It has been shown that many generative models inherit and amplify societal biases. To date, there is no uniform/systematic agreed standard to control/adjust for these biases. This study examines the presence and manipulation of societal biases in leading text-to-image models: Stable Diffusion, DALL-E 3, and Adobe Firefly. Through a comprehensive analysis combining base prompts with modifiers and their sequencing, we uncover the nuanced ways these AI technologies encode biases across gender, race, geography, and region/culture. Our findings reveal the challenges and potential of prompt engineering in controlling biases, highlighting the critical need for ethical AI development promoting diversity and inclusivity. This work advances AI ethics by not only revealing the nuanced dynamics of bias in text-to-image generation models but also by offering a novel framework for future research in controlling bias. Our contributions, spanning comparative analyses, the strategic use of prompt modifiers, the exploration of prompt sequencing effects, and the introduction of a bias sensitivity taxonomy, lay the groundwork for the development of common metrics and standard analyses for evaluating whether and how future AI models exhibit and respond to requests to adjust for inherent biases.




Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You

Felix Friedrich, Katharina Hammerl, Patrick Schramowski, Manuel Brack, Jindrich Libovicky, Kristian Kersting, Alexander Fraser





Text-to-image generation models have recently achieved astonishing results in image quality, flexibility, and text alignment, and are consequently employed in a fast-growing number of applications. Through improvements in multilingual abilities, a larger community now has access to this technology. However, our results show that multilingual models suffer from significant gender biases just as monolingual models do. Furthermore, the natural expectation that multilingual models will provide similar results across languages does not hold up. Instead, there are important differences between languages. We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models. We use MAGBIG to investigate the effect of multilingualism on gender bias in T2I models. To this end, we construct multilingual prompts requesting portraits of people with a certain occupation or trait. Our results show that not only do models exhibit strong gender biases but they also behave differently across languages. Furthermore, we investigate prompt engineering strategies, such as indirect, neutral formulations, to mitigate these biases. Unfortunately, these approaches have limited success and result in worse text-to-image alignment. Consequently, we call for more research into diverse representations across languages in image generators, as well as into steerability to address biased model behavior.




Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Jianhao Yuan, Francesco Pinto, Adam Davies, Philip Torr





Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering. Our extensive empirical findings demonstrate that modern T2I generators like Stable Diffusion can indeed be used as a powerful interventional data augmentation mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.

