FALE: Fairness-Aware ALE Plots for Auditing Bias in Subgroups

Read original: arXiv:2404.18685 - Published 4/30/2024 by Giorgos Giannopoulos, Dimitris Sacharidis, Nikolas Theologitis, Loukas Kavouras, Ioannis Emiris

Overview

Introduces a new method called FALE (Fairness-Aware ALE Plots) for auditing bias in machine learning models across different subgroups
FALE extends the Accumulated Local Effects (ALE) plot technique to provide fairness-aware visualizations that can identify biases in how a model responds to input features
Demonstrates the use of FALE on several real-world datasets to uncover biases that would be difficult to detect using standard model evaluation metrics

Plain English Explanation

The paper proposes a new tool called FALE (Fairness-Aware ALE Plots) that can help identify biases in how machine learning models make predictions across different subgroups of the population.

FALE is based on a technique called Accumulated Local Effects (ALE) plots, which can visualize how a model's predictions change as an input feature is varied. The key innovation of FALE is that it extends ALE plots to specifically look for differences in how the model is behaving for different demographic subgroups, such as men vs. women or different racial groups.

By using FALE, researchers and developers can uncover biases that might not be detected by simply looking at overall model performance metrics. For example, a model might perform well on average, but FALE could reveal that it is much less accurate for certain subgroups. This type of insight is critical for building fair and equitable AI systems that don't disadvantage particular populations.

The paper demonstrates FALE on several real-world datasets, showing how it can identify subgroup-level biases that would be hard to spot otherwise. This highlights the value of specialized tools like FALE for assessing group fairness in machine learning applications.

Technical Explanation

The key technical innovation of this paper is the development of FALE (Fairness-Aware ALE Plots), which extends the Accumulated Local Effects (ALE) plot technique to enable fairness auditing across different demographic subgroups.

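ALE plots visualize how a model's predictions change, on average, as one input feature is varied. Rather than fixing all other features at arbitrary values, ALE splits the feature's range into small intervals, averages the change in prediction across each interval using only the data points that actually fall in it, and then accumulates these local effects. This keeps the plot meaningful when features are correlated and provides insights into the model's behavior that are difficult to glean from standard performance metrics alone.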
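To make the ALE idea concrete, here is a minimal NumPy sketch of a first-order ALE computation for one numeric feature. It is an illustration under stated assumptions, not code from the paper: the function name `ale_1d`, the quantile binning, and the `predict` callable (anything that maps a 2-D array to a 1-D array of predictions) are hypothetical choices made for this example.

```python
import numpy as np

def ale_1d(predict, X, feature, n_bins=10):
    """Sketch of first-order ALE for column `feature` of a 2-D array X.

    `predict` is assumed to map an (n, d) array to n predictions.
    Returns the bin edges and one centered ALE value per bin (at its right edge).
    """
    X = np.asarray(X, dtype=float)
    x = X[:, feature]
    # Quantile-based edges so each bin holds roughly the same number of rows.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    if len(edges) < 2:
        raise ValueError("feature is constant; ALE is undefined")
    # Bin index per row, with the maximum value folded into the last bin.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(edges) - 2)

    local = np.zeros(len(edges) - 1)
    counts = np.zeros(len(edges) - 1)
    for k in range(len(edges) - 1):
        rows = X[idx == k]
        if len(rows) == 0:
            continue  # empty bin contributes no local effect
        lo, hi = rows.copy(), rows.copy()
        lo[:, feature] = edges[k]      # move rows to the bin's left edge
        hi[:, feature] = edges[k + 1]  # and to its right edge
        local[k] = np.mean(predict(hi) - predict(lo))
        counts[k] = len(rows)

    ale = np.cumsum(local)                         # accumulate local effects
    ale -= np.sum(ale * counts) / np.sum(counts)   # center on the data
    return edges, ale
```

For a purely linear model the resulting curve is a straight line whose slope equals the feature's coefficient, which makes a convenient sanity check.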

FALE builds on ALE plots by bringing protected attributes such as gender or race into the analysis. Instead of tracking the raw prediction, a FALE plot tracks how a fairness measure for the affected population changes across different values of a feature. This lets the visualization highlight the feature ranges, and therefore the subgroups, in which the model treats people differently, potentially uncovering biases.
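A rough sketch of this idea, guided by the abstract's description of "measuring the change in fairness for an affected population corresponding to different values of a feature", is shown below. It is not the paper's exact definition. The choice of statistical parity difference as the fairness measure, the quantile binning, and the name `fairness_by_feature_bin` are all assumptions made for illustration.

```python
import numpy as np

def fairness_by_feature_bin(y_pred, protected, x, n_bins=5):
    """Change in statistical parity difference (SPD) per bin of feature x.

    SPD here = mean prediction for the protected group minus mean prediction
    for everyone else (an assumed convention). Each returned row is
    (left_edge, right_edge, population_size, SPD_in_bin - SPD_overall).
    """
    y_pred = np.asarray(y_pred, dtype=float)
    protected = np.asarray(protected, dtype=bool)
    x = np.asarray(x, dtype=float)

    def spd(mask):
        a = y_pred[mask & protected]
        b = y_pred[mask & ~protected]
        if len(a) == 0 or len(b) == 0:
            return np.nan  # one group is absent, so SPD is undefined here
        return a.mean() - b.mean()

    overall = spd(np.ones(len(x), dtype=bool))
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    if len(edges) < 2:
        raise ValueError("feature is constant; nothing to bin")
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(edges) - 2)

    rows = []
    for k in range(len(edges) - 1):
        mask = idx == k
        rows.append((edges[k], edges[k + 1], int(mask.sum()), spd(mask) - overall))
    return overall, rows
```

Reporting the population size next to each bin helps a reader judge whether a large fairness change rests on only a handful of individuals.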

The paper demonstrates FALE on several real-world datasets, including mortgage application and criminal risk assessment datasets. The results show that FALE can identify nuanced biases that would be missed by just looking at overall model performance.

For example, the FALE plots may reveal that the model is much more sensitive to a particular feature (e.g. income) for one subgroup compared to another, even if the overall prediction accuracy is similar. This type of insight is critical for building fair and equitable AI systems that don't systematically disadvantage certain populations.
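The kind of comparison described above can be approximated with a simple finite-difference check: shift one feature by a small amount and compare the average change in prediction per subgroup. This is a rough proxy rather than a FALE plot, and `sensitivity_by_group`, `delta`, and the `predict` callable are hypothetical names introduced for this sketch.

```python
import numpy as np

def sensitivity_by_group(predict, X, feature, group, delta=1.0):
    """Average change in prediction per unit change in `feature`, per group.

    `predict` is assumed to map an (n, d) array to n predictions, and
    `group` holds one subgroup label per row of X.
    """
    X = np.asarray(X, dtype=float)
    group = np.asarray(group)
    shifted = X.copy()
    shifted[:, feature] += delta  # perturb only the chosen feature
    # Per-row finite-difference slope of the prediction.
    slope = (predict(shifted) - predict(X)) / delta
    return {g.item(): float(slope[group == g].mean()) for g in np.unique(group)}
```

Because shifting a feature can create unrealistic rows when features are correlated, a large gap between groups is best treated as a prompt for closer inspection rather than as proof of bias.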

Critical Analysis

The FALE method proposed in this paper represents an important step forward in tools for auditing the fairness of machine learning models. By extending the powerful ALE plot technique to explicitly consider subgroup differences, FALE provides a valuable new lens for uncovering biases that might otherwise go undetected.

That said, the paper does acknowledge some limitations of the approach. FALE relies on having access to relevant subgroup information (e.g. gender, race) in the dataset, which may not always be available or easy to obtain. Additionally, the method assumes that the subgroups of interest are known in advance, rather than allowing for the exploration of more complex, intersectional categories.

There is also the broader question of how to interpret and act on the insights provided by FALE. While the visualizations can highlight problematic differences in model behavior, translating those findings into concrete fairness interventions remains a challenge. Further research is needed on frameworks for using fairness-aware model explanations like FALE to drive model improvements.

Overall, though, this paper makes an important contribution by introducing a powerful new tool for the crucial task of auditing the fairness of machine learning systems. As AI systems become increasingly prevalent in high-stakes decision making, methods like FALE will only grow in importance for ensuring these technologies are equitable and unbiased.

Conclusion

The FALE (Fairness-Aware ALE Plots) method proposed in this paper represents a significant advance in tools for auditing the fairness of machine learning models. By extending the Accumulated Local Effects (ALE) plot technique to explicitly consider differences across demographic subgroups, FALE provides a powerful new way to uncover biases that might otherwise be missed.

The paper demonstrates the use of FALE on several real-world datasets, showing how it can reveal nuanced, subgroup-level biases that would be difficult to detect using standard model performance metrics alone. This highlights the critical importance of having specialized fairness analysis tools like FALE to support the development of fair and equitable AI systems that don't disadvantage particular populations.

While FALE has some limitations, such as the need for subgroup information in the dataset, it represents an important step forward in the ongoing effort to build more transparent and accountable machine learning models. As AI continues to play an increasingly prominent role in high-stakes decision-making, methods like FALE will only grow in importance for ensuring these technologies are serving all members of society equitably.


Related Papers

FALE: Fairness-Aware ALE Plots for Auditing Bias in Subgroups

Giorgos Giannopoulos, Dimitris Sacharidis, Nikolas Theologitis, Loukas Kavouras, Ioannis Emiris

Fairness is steadily becoming a crucial requirement of Machine Learning (ML) systems. A particularly important notion is subgroup fairness, i.e., fairness in subgroups of individuals that are defined by more than one attributes. Identifying bias in subgroups can become both computationally challenging, as well as problematic with respect to comprehensibility and intuitiveness of the finding to end users. In this work we focus on the latter aspects; we propose an explainability method tailored to identifying potential bias in subgroups and visualizing the findings in a user friendly manner to end users. In particular, we extend the ALE plots explainability method, proposing FALE (Fairness aware Accumulated Local Effects) plots, a method for measuring the change in fairness for an affected population corresponding to different values of a feature (attribute). We envision FALE to function as an efficient, user friendly, comprehensible and reliable first-stage tool for identifying subgroups with potential bias issues.

4/30/2024

Procedural Fairness in Machine Learning

Ziming Wang, Changwu Huang, Xin Yao

Fairness in machine learning (ML) has received much attention. However, existing studies have mainly focused on the distributive fairness of ML models. The other dimension of fairness, i.e., procedural fairness, has been neglected. In this paper, we first define the procedural fairness of ML models, and then give formal definitions of individual and group procedural fairness. We propose a novel metric to evaluate the group procedural fairness of ML models, called $GPF_{FAE}$, which utilizes a widely used explainable artificial intelligence technique, namely feature attribution explanation (FAE), to capture the decision process of the ML models. We validate the effectiveness of $GPF_{FAE}$ on a synthetic dataset and eight real-world datasets. Our experiments reveal the relationship between procedural and distributive fairness of the ML model. Based on our analysis, we propose a method for identifying the features that lead to the procedural unfairness of the model and propose two methods to improve procedural fairness after identifying unfair features. Our experimental results demonstrate that we can accurately identify the features that lead to procedural unfairness in the ML model, and both of our proposed methods can significantly improve procedural fairness with a slight impact on model performance, while also improving distributive fairness.

4/3/2024

When mitigating bias is unfair: multiplicity and arbitrariness in algorithmic group fairness

Natasa Krco, Thibault Laugel, Vincent Grari, Jean-Michel Loubes, Marcin Detyniecki

Most research on fair machine learning has prioritized optimizing criteria such as Demographic Parity and Equalized Odds. Despite these efforts, there remains a limited understanding of how different bias mitigation strategies affect individual predictions and whether they introduce arbitrariness into the debiasing process. This paper addresses these gaps by exploring whether models that achieve comparable fairness and accuracy metrics impact the same individuals and mitigate bias in a consistent manner. We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions: Impact Size (how many people were affected), Change Direction (positive versus negative changes), Decision Rates (impact on models' acceptance rates), Affected Subpopulations (who was affected), and Neglected Subpopulations (where unfairness persists). This framework is intended to help practitioners understand the impacts of debiasing processes and make better-informed decisions regarding model selection. Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods. These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.

5/24/2024

AIM: Attributing, Interpreting, Mitigating Data Unfairness

Zhining Liu, Ruizhong Qiu, Zhichen Zeng, Yada Zhu, Hendrik Hamann, Hanghang Tong

Data collected in the real world often encapsulates historical discrimination against disadvantaged groups and individuals. Existing fair machine learning (FairML) research has predominantly focused on mitigating discriminative bias in the model prediction, with far less effort dedicated towards exploring how to trace biases present in the data, despite its importance for the transparency and interpretability of FairML. To fill this gap, we investigate a novel research problem: discovering samples that reflect biases/prejudices from the training data. Grounding on the existing fairness notions, we lay out a sample bias criterion and propose practical algorithms for measuring and countering sample bias. The derived bias score provides intuitive sample-level attribution and explanation of historical bias in data. On this basis, we further design two FairML strategies via sample-bias-informed minimal data editing. They can mitigate both group and individual unfairness at the cost of minimal or zero predictive utility loss. Extensive experiments and analyses on multiple real-world datasets demonstrate the effectiveness of our methods in explaining and mitigating unfairness. Code is available at https://github.com/ZhiningLiu1998/AIM.

6/19/2024