Formal Specification, Assessment, and Enforcement of Fairness for Generative AIs






Published 5/7/2024 by Chih-Hong Cheng, Changshun Wu, Harald Ruess, Xingyu Zhao, Saddek Bensalem



The risk of reinforcing or even exacerbating societal biases and inequalities will grow significantly as generative AI increasingly produces useful real-world artifacts, from text to images and beyond. We address these issues by formally characterizing the notion of fairness for generative AI as a basis for monitoring and enforcing fairness. We define two levels of fairness using the notion of infinite sequences of abstractions of AI-generated artifacts such as text or images. The first is the fairness demonstrated on the generated sequences, which is evaluated only on the outputs and is agnostic to the prompts and models used. The second is the inherent fairness of the generative AI model, which requires that fairness be manifested when input prompts are neutral, that is, when they do not explicitly instruct the generative AI to produce a particular type of output. We also study relative intersectional fairness, which counteracts the combinatorial explosion that arises when multiple categories are considered together, as well as lazy fairness enforcement. Finally, fairness monitoring and enforcement are tested against some current generative AI models.



  • The paper focuses on the risk of reinforcing societal biases and inequalities as generative AI systems produce content that increasingly resembles human output.
  • It formally defines the concept of fairness for generative AI, proposing two levels of fairness: fairness in the generated sequences and inherent fairness of the generative AI model.
  • The paper also explores relative intersectional fairness and lazy fairness enforcement to address the complexities of fairness across multiple categories.
  • The authors implement a specification monitoring and enforcement tool to test the fairness of several generative AI models.

Plain English Explanation

As generative AI models become more advanced, they are producing content that is difficult to distinguish from human-created output. This raises concerns that these AI systems could inadvertently reinforce or worsen existing biases and inequalities in society.

To address this issue, the researchers in this paper have developed a formal framework for defining fairness in the context of generative AI. They propose two levels of fairness:

  1. Fairness in the generated sequences: This looks at the fairness of the actual output produced by the AI, regardless of the prompts or models used.
  2. Inherent fairness of the generative AI model: This requires that the AI's fairness be manifested even when the input prompts are neutral, without explicitly instructing the AI to produce a particular type of output.

The paper also explores relative intersectional fairness, which considers fairness across multiple categories (e.g., race, gender, age) simultaneously, and lazy fairness enforcement, which aims to address the complexity of maintaining fairness across these multiple dimensions.

The researchers have developed a tool to monitor and enforce these fairness specifications when testing various generative AI models. This work is an important step in ensuring that the development of AI systems aligns with diverse human values and helps mitigate the risk of perpetuating or amplifying societal biases.

Technical Explanation

The paper proposes a formal characterization of fairness for generative AI systems, which are becoming increasingly advanced in their ability to produce human-like content. The researchers define two levels of fairness:

  1. Fairness in the generated sequences: This level of fairness is evaluated solely on the outputs of the generative AI, without considering the prompts or models used to produce them. The goal is to ensure that the generated content does not exhibit unfair biases or inequalities.

  2. Inherent fairness of the generative AI model: This level of fairness requires that the AI's fairness be manifested even when the input prompts are neutral, without explicitly instructing the AI to produce a particular type of output. This aims to ensure that the AI's underlying fairness is not dependent on the prompts used.
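
As a rough illustration of the first level, a monitor only needs the stream of abstracted outputs (for example, one label extracted from each generated image). The sketch below assumes a sliding-window criterion: every value of a category must occur at least once in any `window` consecutive outputs. That criterion, the function name `first_unfair_window`, and the example labels are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def first_unfair_window(
    labels: Sequence[Hashable],
    values: Sequence[Hashable],
    window: int,
) -> Optional[int]:
    """Return the start index of the first window missing some value, or None.

    Assumed criterion (illustrative only): each value in `values` must occur
    at least once in every run of `window` consecutive labels.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    required = set(values)
    counts: Counter = Counter()
    for i, label in enumerate(labels):
        counts[label] += 1
        if i >= window:
            # Drop the label that just slid out of the window.
            counts[labels[i - window]] -= 1
        if i >= window - 1 and any(counts[v] == 0 for v in required):
            return i - window + 1
    return None


# Hypothetical abstracted outputs: one label per generated artifact.
labels = ["f", "m", "m", "m", "f"]
violation_start = first_unfair_window(labels, ["f", "m"], window=3)
```

Because the check reads only the labels, it stays agnostic to the prompts and the model. Applying the same check to outputs obtained from neutral prompts only would be one way to probe the second level.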

The paper also introduces the concept of relative intersectional fairness, which considers fairness across multiple categories (e.g., race, gender, age) simultaneously, and lazy fairness enforcement, which addresses the complexity of maintaining fairness across these multiple dimensions.
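
To see where the combinatorial explosion comes from, it helps to count the intersectional groups directly. The categories and values below are hypothetical placeholders, and the helper names are assumptions for this sketch; the paper's own categories may differ.

```python
from itertools import product
from math import prod
from typing import Dict, List, Tuple

# Hypothetical categories and values, chosen only for illustration.
categories: Dict[str, List[str]] = {
    "gender": ["female", "male", "non-binary"],
    "age": ["young", "middle-aged", "senior"],
    "skin_tone": ["light", "medium", "dark"],
}


def intersectional_groups(cats: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """Enumerate every combination of one value per category."""
    return list(product(*cats.values()))


def num_groups(cats: Dict[str, List[str]]) -> int:
    """Count the combinations without enumerating them."""
    return prod(len(values) for values in cats.values())


groups = intersectional_groups(categories)
total = num_groups(categories)  # 3 * 3 * 3 = 27
```

With three categories of three values each there are already 27 groups, so a requirement that every group appears cannot be satisfied by fewer than 27 outputs, and each additional category multiplies that number again.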

The researchers have implemented a specification monitoring and enforcement tool to test the fairness of several generative AI models. This tool allows them to evaluate the fairness of the generated content and the inherent fairness of the AI models themselves.
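
One plausible reading of "lazy" enforcement is that the enforcer lets the model generate freely and steps in only when skipping an intervention would make a violation unavoidable. The sketch below follows that reading under an assumed sliding-window criterion (every value occurs in each run of `window` consecutive outputs). The function `lazy_enforce`, the `sample` callback, and the override rule are all assumptions for illustration and do not describe the authors' tool.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def lazy_enforce(
    sample: Callable[[], str],
    values: Sequence[str],
    window: int,
    steps: int,
) -> Tuple[List[str], int]:
    """Produce `steps` labels, overriding the model only when unavoidable.

    `sample()` stands in for "prompt the model neutrally and abstract the
    output to a label". An override stands in for re-prompting the model
    with an explicit instruction.
    """
    if window < len(values):
        raise ValueError("window must be at least the number of values")
    last_seen: Dict[str, int] = {v: -1 for v in values}
    outputs: List[str] = []
    overrides = 0
    for i in range(steps):
        # A value last seen at index j must reappear by index j + window.
        by_deadline = sorted(values, key=lambda v: last_seen[v])
        # If the k most urgent values need all k slots left before their
        # deadline, there is no slack: emit the most urgent value now.
        tight = any(
            last_seen[v] + window - i + 1 <= k
            for k, v in enumerate(by_deadline, start=1)
        )
        if tight:
            label = by_deadline[0]
            overrides += 1
        else:
            label = sample()
        outputs.append(label)
        if label in last_seen:
            last_seen[label] = i
    return outputs, overrides


# A maximally biased stand-in model that always yields "a".
outputs, overrides = lazy_enforce(lambda: "a", ["a", "b"], window=3, steps=9)
```

The override count gives a crude measure of how much steering a model needs: a model whose neutral-prompt outputs are already balanced would rarely trigger the tight case.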

Critical Analysis

The paper presents a comprehensive framework for defining and enforcing fairness in generative AI systems, which is a crucial issue as these systems become increasingly capable and prevalent. The researchers have thoughtfully considered the complexities of fairness, including the challenges of intersectionality and the trade-offs involved in different fairness enforcement approaches.

One potential limitation of the research is the extent to which the proposed fairness definitions and enforcement mechanisms can be practically implemented and scaled to the vast and diverse outputs of modern generative AI systems. The paper acknowledges the combinatorial explosion of fairness considerations when dealing with multiple attributes simultaneously, and it remains to be seen how effectively the "lazy fairness enforcement" approach can address this challenge in real-world deployments.

Additionally, the paper's focus is primarily on the technical aspects of fairness, and it does not delve deeply into the broader societal implications and ethical considerations surrounding the use of generative AI. Further research may be needed to explore the alignment of AI development with diverse human values and the potential trade-offs between fair representations and other desirable AI capabilities.

Overall, this paper provides a valuable contribution to the field of AI fairness by formally defining the problem and proposing solutions, but continued interdisciplinary collaboration and public discourse will be crucial to ensure that the development of generative AI systems truly benefits society in an equitable manner.


This paper presents a formal framework for defining and enforcing fairness in generative AI systems, a critical issue as these systems become more advanced and influential. By proposing two levels of fairness (fairness in the generated sequences and inherent fairness of the generative AI model), the researchers have laid the groundwork for monitoring and improving the fairness of these systems.

The exploration of relative intersectional fairness and lazy fairness enforcement highlights the complexities involved in ensuring fairness across multiple attributes and the need for practical and scalable solutions. While the paper focuses primarily on the technical aspects, it underscores the broader societal implications and the importance of aligning the development of generative AI with diverse human values.

As generative AI continues to shape our world, this research provides a valuable foundation for ongoing efforts to mitigate the risks of reinforcing or exacerbating societal biases and inequalities. By rigorously defining and enforcing fairness, the field can work towards developing generative AI systems that truly benefit all members of society.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Fair by design: A sociotechnical approach to justifying the fairness of AI-enabled systems across the lifecycle

Marten H. L. Kaas, Christopher Burr, Zoe Porter, Berk Ozturk, Philippa Ryan, Michael Katell, Nuala Polo, Kalle Westerling, Ibrahim Habli





Fairness is one of the most commonly identified ethical principles in existing AI guidelines, and the development of fair AI-enabled systems is required by new and emerging AI regulation. But most approaches to addressing the fairness of AI-enabled systems are limited in scope in two significant ways: their substantive content focuses on statistical measures of fairness, and they do not emphasize the need to identify and address fairness considerations across the whole AI lifecycle. Our contribution is to present an assurance framework and tool that can enable a practical and transparent method for widening the scope of fairness considerations across the AI lifecycle and move the discussion beyond mere statistical notions of fairness to consider a richer analysis in a practical and context-dependent manner. To illustrate this approach, we first describe and then apply the framework of Trustworthy and Ethical Assurance (TEA) to an AI-enabled clinical diagnostic support system (CDSS) whose purpose is to help clinicians predict the risk of developing hypertension in patients with Type 2 diabetes, a context in which several fairness considerations arise (e.g., discrimination against patient subgroups). This is supplemented by an open-source tool and a fairness considerations map to help facilitate reasoning about the fairness of AI-enabled systems in a participatory way. In short, by using a shared framework for identifying, documenting and justifying fairness considerations, and then using this deliberative exercise to structure an assurance case, research on AI fairness becomes reusable and generalizable for others in the ethical AI community and for sharing best practices for achieving fairness and equity in digital health and healthcare in particular.




The Impossibility of Fair LLMs

Jacy Anthis, Kristian Lum, Michael Ekstrand, Avi Feller, Alexander D'Amour, Chenhao Tan





The need for fair AI is increasingly clear in the era of general-purpose systems such as ChatGPT, Gemini, and other large language models (LLMs). However, the increasing complexity of human-AI interaction and its social impacts have raised questions of how fairness standards could be applied. Here, we review the technical frameworks that machine learning researchers have used to evaluate fairness, such as group fairness and fair representations, and find that their application to LLMs faces inherent limitations. We show that each framework either does not logically extend to LLMs or presents a notion of fairness that is intractable for LLMs, primarily due to the multitudes of populations affected, sensitive attributes, and use cases. To address these challenges, we develop guidelines for the more realistic goal of achieving fairness in particular use cases: the criticality of context, the responsibility of LLM developers, and the need for stakeholder participation in an iterative process of design and evaluation. Moreover, it may eventually be possible and even necessary to use the general-purpose capabilities of AI systems to address fairness challenges as a form of scalable AI-assisted alignment.



Mapping the Potential of Explainable Artificial Intelligence (XAI) for Fairness Along the AI Lifecycle


Luca Deck, Astrid Schomacker, Timo Speith, Jakob Schoffer, Lena Kastner, Niklas Kuhl





The widespread use of artificial intelligence (AI) systems across various domains is increasingly surfacing issues related to algorithmic fairness, especially in high-stakes scenarios. Thus, critical considerations of how fairness in AI systems might be improved -- and what measures are available to aid this process -- are overdue. Many researchers and policymakers see explainable AI (XAI) as a promising way to increase fairness in AI systems. However, there is a wide variety of XAI methods and fairness conceptions expressing different desiderata, and the precise connections between XAI and fairness remain largely nebulous. Besides, different measures to increase algorithmic fairness might be applicable at different points throughout an AI system's lifecycle. Yet, there currently is no coherent mapping of fairness desiderata along the AI lifecycle. In this paper, we distill eight fairness desiderata, map them along the AI lifecycle, and discuss how XAI could help address each of them. We hope to provide orientation for practical applications and to inspire XAI research specifically focused on these fairness desiderata.



Evaluating AI Group Fairness: a Fuzzy Logic Perspective


Emmanouil Krasanakis, Symeon Papadopoulos





Artificial intelligence systems often address fairness concerns by evaluating and mitigating measures of group discrimination, for example that indicate biases against certain genders or races. However, what constitutes group fairness depends on who is asked and the social context, whereas definitions are often relaxed to accept small deviations from the statistical constraints they set out to impose. Here we decouple definitions of group fairness both from the context and from relaxation-related uncertainty by expressing them in the axiomatic system of Basic fuzzy Logic (BL) with loosely understood predicates, like encountering group members. We then evaluate the definitions in subclasses of BL, such as Product or Lukasiewicz logics. Evaluation produces continuous instead of binary truth values by choosing the logic subclass and truth values for predicates that reflect uncertain context-specific beliefs, such as stakeholder opinions gathered through questionnaires. Internally, it follows logic-specific rules to compute the truth values of definitions. We show that commonly held propositions standardize the resulting mathematical formulas and we transcribe logic and truth value choices to layperson terms, so that anyone can answer them. We also use our framework to study several literature definitions of algorithmic fairness, for which we rationalize previous expedient practices that are non-probabilistic and show how to re-interpret their formulas and parameters in new contexts.

