A Canonical Data Transformation for Achieving Inter- and Within-group Fairness

Read original: arXiv:2310.15097 - Published 7/9/2024 by Zachary McBride Lazri, Ivan Brugere, Xin Tian, Dana Dachman-Soled, Antigoni Polychroniadou, Danial Dervovic, Min Wu

Overview

As machine learning algorithms are increasingly used for applications involving sensitive data, the issue of fairness in these algorithms has become a growing concern.
While previous work has focused on achieving fairness between different demographic groups, methods that enforce it can sometimes treat individuals within the same group unfairly.
This paper introduces a formal definition of "within-group fairness" and proposes a pre-processing framework to address both inter-group and within-group fairness with minimal impact on accuracy.

Plain English Explanation

Machine learning algorithms are increasingly used for important decisions that affect people's lives, such as loan applications, criminal risk assessments, and university admissions. This has raised concerns about whether these algorithms treat different demographic groups (e.g., groups defined by race, gender, or age) fairly.

Previous research has looked at ways to make algorithms "fair" between groups - for example, ensuring that people of different races have an equal chance of being approved for a loan. However, the authors of this paper point out that algorithms aiming for this type of "group fairness" can sometimes end up treating individuals within the same group unfairly.

To address this, the researchers propose a new way to define and achieve "within-group fairness." Their approach ensures that individuals from the same demographic group are treated fairly relative to each other, not just that different groups are treated equally on average.

The key idea is to transform the input data in a way that preserves the relative rankings of individuals within each group, while also ensuring fairness between groups. This is done through a pre-processing step before the data is fed into the machine learning model.

The researchers test their framework on two real-world datasets, COMPAS and Law School, and show that it can achieve both inter-group and within-group fairness with only a small decrease in overall accuracy compared to other fairness techniques.

Technical Explanation

The authors introduce a formal definition of "within-group fairness" that captures the idea of fairness among individuals within the same demographic group, in addition to fairness between groups. They propose a pre-processing framework to meet both inter-group and within-group fairness criteria without significantly compromising model accuracy.
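One way to write these two requirements down is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $f$ is the scoring function, $T_g$ is the mapping applied to members of group $g$, and $t$ is a decision threshold. The paper's formal definitions may differ in detail.

```latex
% Within-group fairness (assumed reading): the mapping keeps the score
% ordering of any two individuals from the same group g.
f(x_i) \le f(x_j) \;\Longrightarrow\; f(T_g(x_i)) \le f(T_g(x_j))
\quad \text{for all } x_i, x_j \text{ in group } g .

% Inter-group fairness (assumed demographic-parity-style reading):
% selection rates after the mapping agree across any two groups g and g'.
\Pr\big[ f(T_g(X)) > t \mid G = g \big]
  = \Pr\big[ f(T_{g'}(X)) > t \mid G = g' \big] .
```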

The framework works by first mapping the input feature vectors of individuals from different groups to an "inter-group-fair canonical domain." This mapping is designed to preserve the relative ranking of individuals within each group, while also ensuring fairness between groups. The transformed feature vectors are then fed into a scoring function to make predictions.
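To make the idea of a rank-preserving mapping concrete, here is a minimal sketch. It is not the authors' construction: it swaps in a simpler, well-known technique (group-wise quantile normalization) and works on one-dimensional scores rather than full feature vectors. The function name `to_canonical` and the toy data are hypothetical.

```python
from bisect import bisect_right

def to_canonical(scores, groups):
    """Map each score to its within-group quantile in (0, 1].

    Illustrative assumption, not the paper's mapping: the empirical CDF
    is monotone within each group, so within-group order is preserved,
    and every group ends up spread over the same (0, 1] range.
    """
    # Collect and sort the scores of each group separately.
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    for g in by_group:
        by_group[g].sort()

    canonical = []
    for s, g in zip(scores, groups):
        ranked = by_group[g]
        # Fraction of the group's scores that are <= s (empirical CDF).
        canonical.append(bisect_right(ranked, s) / len(ranked))
    return canonical

# Toy data: group "b" has systematically lower raw scores than group "a".
scores = [0.2, 0.4, 0.6, 0.8, 0.1, 0.2, 0.3, 0.4]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
canonical = to_canonical(scores, groups)
print(canonical)
```

In this toy example both groups land on the same set of canonical values, so any threshold selects the same share of each group, while the ordering of individuals inside each group is unchanged.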

The researchers evaluate their framework on the COMPAS risk assessment and Law School datasets. They compare its performance in achieving inter-group and within-group fairness to two existing regularization-based fairness methods. The results show that their approach can successfully satisfy both fairness criteria with only a small drop in overall accuracy compared to the other techniques.
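The two kinds of fairness being compared can be measured in many ways. The sketch below uses two simple stand-ins that are assumptions for illustration and may not match the metrics reported in the paper: the gap in selection rates between groups (inter-group), and the fraction of same-group pairs whose order survives an intervention (within-group). The function names and data are hypothetical.

```python
from itertools import combinations

def selection_rate_gap(scores, groups, threshold):
    """Largest difference in selection rate between any two groups."""
    selected, total = {}, {}
    for s, g in zip(scores, groups):
        total[g] = total.get(g, 0) + 1
        selected[g] = selected.get(g, 0) + (1 if s >= threshold else 0)
    rates = [selected[g] / total[g] for g in total]
    return max(rates) - min(rates)

def within_group_order_preserved(before, after, groups):
    """Fraction of same-group pairs whose strict order is unchanged."""
    kept = compared = 0
    for i, j in combinations(range(len(groups)), 2):
        if groups[i] != groups[j] or before[i] == before[j]:
            continue  # skip cross-group pairs and ties in the original scores
        compared += 1
        # A positive product means both differences have the same sign.
        if (before[i] - before[j]) * (after[i] - after[j]) > 0:
            kept += 1
    return kept / compared if compared else 1.0

groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
raw = [0.2, 0.4, 0.6, 0.8, 0.1, 0.2, 0.3, 0.4]
# Hypothetical adjusted scores: group rates are equalized at threshold 0.5,
# but one member of group "b" jumps ahead of two higher-scoring peers.
adjusted = [0.2, 0.4, 0.6, 0.8, 0.7, 0.2, 0.3, 0.9]

raw_gap = selection_rate_gap(raw, groups, 0.5)
adjusted_gap = selection_rate_gap(adjusted, groups, 0.5)
order_kept = within_group_order_preserved(raw, adjusted, groups)
print(raw_gap, adjusted_gap, order_kept)
```

Here the adjusted scores close the selection-rate gap, yet some within-group pairs are reordered, which is the situation within-group fairness is meant to rule out.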

Critical Analysis

A key strength of this work is the formal definition of within-group fairness, which addresses an important limitation of previous group fairness approaches. By preserving the relative ranking of individuals within each group, the proposed framework helps ensure that everyone is treated fairly compared to their peers, not just that different groups have equal average outcomes.

However, the paper does not explore how the choice of mapping function might impact the trade-off between fairness and accuracy. There may be more sophisticated techniques for constructing this mapping that could further improve the overall performance.

Additionally, the experiments are limited to two specific datasets. More research is needed to understand how well the framework generalizes to a wider range of applications and data types. There may also be other important fairness criteria, beyond inter-group and within-group fairness, that should be considered.

Overall, this work makes an important contribution by highlighting the need to consider fairness at the individual level, not just the group level. The proposed pre-processing approach provides a promising direction for achieving a more comprehensive notion of fairness in machine learning systems.

Conclusion

This paper introduces a novel framework for achieving both inter-group and within-group fairness in machine learning models with minimal impact on overall accuracy. By mapping input features to a canonical domain that preserves relative rankings within groups, the approach ensures that individuals are treated fairly not just compared to other demographic groups, but also within their own group.

The experiments demonstrate the effectiveness of this framework on real-world datasets, suggesting it could be a valuable tool for developing fair and equitable machine learning systems. As the use of these algorithms continues to grow, addressing fairness concerns at both the group and individual level will be crucial for building trust and ensuring these technologies have a positive impact on society.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

A Canonical Data Transformation for Achieving Inter- and Within-group Fairness

Zachary McBride Lazri, Ivan Brugere, Xin Tian, Dana Dachman-Soled, Antigoni Polychroniadou, Danial Dervovic, Min Wu

Increases in the deployment of machine learning algorithms for applications that deal with sensitive data have brought attention to the issue of fairness in machine learning. Many works have been devoted to applications that require different demographic groups to be treated fairly. However, algorithms that aim to satisfy inter-group fairness (also called group fairness) may inadvertently treat individuals within the same demographic group unfairly. To address this issue, we introduce a formal definition of within-group fairness that maintains fairness among individuals from within the same group. We propose a pre-processing framework to meet both inter- and within-group fairness criteria with little compromise in accuracy. The framework maps the feature vectors of members from different groups to an inter-group-fair canonical domain before feeding them into a scoring function. The mapping is constructed to preserve the relative relationship between the scores obtained from the unprocessed feature vectors of individuals from the same demographic group, guaranteeing within-group fairness. We apply this framework to the COMPAS risk assessment and Law School datasets and compare its performance in achieving inter-group and within-group fairness to two regularization-based methods.

7/9/2024

Reranking individuals: The effect of fair classification within-groups

Sofie Goethals, Toon Calders

Artificial Intelligence (AI) finds widespread application across various domains, but it sparks concerns about fairness in its deployment. The prevailing discourse in classification often emphasizes outcome-based metrics comparing sensitive subgroups without a nuanced consideration of the differential impacts within subgroups. Bias mitigation techniques not only affect the ranking of pairs of instances across sensitive groups, but often also significantly affect the ranking of instances within these groups. Such changes are hard to explain and raise concerns regarding the validity of the intervention. Unfortunately, these effects remain under the radar in the accuracy-fairness evaluation framework that is usually applied. Additionally, we illustrate the effect of several popular bias mitigation methods, and how their output often does not reflect real-world scenarios.

5/24/2024

Synthetic Data Generation for Intersectional Fairness by Leveraging Hierarchical Group Structure

Gaurav Maheshwari, Aurélien Bellet, Pascal Denis, Mikaela Keller

In this paper, we introduce a data augmentation approach specifically tailored to enhance intersectional fairness in classification tasks. Our method capitalizes on the hierarchical structure inherent to intersectionality, by viewing groups as intersections of their parent categories. This perspective allows us to augment data for smaller groups by learning a transformation function that combines data from these parent groups. Our empirical analysis, conducted on four diverse datasets including both text and images, reveals that classifiers trained with this data augmentation approach achieve superior intersectional fairness and are more robust to "leveling down" when compared to methods optimizing traditional group fairness metrics.

5/24/2024

Counterpart Fairness -- Addressing Systematic between-group Differences in Fairness Evaluation

Yifei Wang, Zhengyang Zhou, Liqin Wang, John Laurentiev, Peter Hou, Li Zhou, Pengyu Hong

When using machine learning (ML) to aid decision-making, it is critical to ensure that an algorithmic decision is fair and does not discriminate against specific individuals/groups, particularly those from underprivileged populations. Existing group fairness methods aim to ensure equal outcomes (such as loan approval rates) across groups delineated by protected variables like race or gender. However, these methods overlook the intricate, inherent differences among these groups that could influence outcomes. The confounding factors, which are non-protected variables but manifest systematic differences, can significantly affect fairness evaluation. Therefore, we recommend a more refined and comprehensive approach that accounts for both the systematic differences within groups and the multifaceted, intertwined confounding effects. We proposed a fairness metric based on counterparts (i.e., individuals who are similar with respect to the task of interest) from different groups, whose group identities cannot be distinguished algorithmically by exploring confounding factors. We developed a propensity-score-based method for identifying counterparts, avoiding the issue of comparing oranges with apples. In addition, we introduced a counterpart-based statistical fairness index, called Counterpart-Fairness (CFair), to assess the fairness of ML models. Various empirical studies were conducted to validate the effectiveness of CFair.

9/6/2024