Differentially Private Post-Processing for Fair Regression

2405.04034

Published 5/8/2024 by Ruicheng Xian, Qiaobo Li, Gautam Kamath, Han Zhao

↗️

Abstract

This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns of their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism, then their Wasserstein barycenter is computed, and the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide fairness guarantee, revealing a trade-off between the statistical bias and variance induced from the choice of the number of bins in the histogram, in which using less bins always favors fairness at the expense of error.

Create account to get full access

Overview

This paper proposes a differentially private algorithm for learning fair machine learning models that satisfy statistical parity.
The algorithm addresses both privacy concerns of training on sensitive data and fairness concerns of propagating historical biases.
It consists of three steps: estimating output distributions, computing a Wasserstein barycenter, and using optimal transports to satisfy fairness.
The algorithm provides a trade-off between statistical bias and variance by adjusting the number of bins used in the histogram.

Plain English Explanation

The paper presents a way to make machine learning models more fair and protect people's privacy. Machine learning models trained on sensitive data can sometimes reflect historical biases, leading to unfair outcomes. This algorithm helps fix that by post-processing the model's outputs to satisfy statistical parity - ensuring the model treats different groups equally.

To do this, the algorithm first estimates the distribution of the model's outputs privately using a technique called histogram density estimation. It then computes an "average" of these distributions, called a Wasserstein barycenter. Finally, it adjusts the model's outputs to match this average distribution, ensuring fairness.

The key trade-off is that using fewer bins in the histogram leads to more fairness but more error in the model's predictions. So the algorithm allows tuning this parameter to balance fairness and accuracy.

Technical Explanation

The core of the algorithm is a three-step process:

Privately Estimate Output Distributions: The algorithm first estimates the distribution of the model's outputs for each group using a differentially private histogram density estimation technique. This involves adding noise to the histogram to protect privacy.
Compute Wasserstein Barycenter: Next, it computes the Wasserstein barycenter - a statistical summary that represents the "average" of these group-specific distributions. The Wasserstein distance is a metric that quantifies the difference between distributions.
Post-process using Optimal Transports: Finally, the algorithm finds the optimal transport maps to remap the model's outputs to match the Wasserstein barycenter, ensuring statistical parity across groups.

The authors provide a theoretical analysis of the sample complexity and fairness guarantees of this algorithm. They show a trade-off between statistical bias and variance, where using fewer histogram bins favors fairness but can increase error.

Critical Analysis

The paper presents a thoughtful approach to addressing both privacy and fairness concerns in machine learning models. The authors acknowledge that there is an inherent tension between fairness and model accuracy, and their algorithm allows tuning this trade-off.

One potential limitation is the reliance on group-level notions of fairness, like statistical parity. While this is a common fairness metric, there are other perspectives, like individual fairness, that may be worth considering. The authors could explore how their technique might extend to these other fairness notions.

Additionally, the paper focuses on post-processing a pre-trained model. An interesting area for future work could be to integrate this fairness-promoting technique directly into the training process, potentially leading to more robust and fair models from the ground up.

Overall, this is a valuable contribution that tackles important challenges at the intersection of privacy, fairness, and machine learning. The technical insights and algorithmic framework provide a solid foundation for further research and real-world applications.

Conclusion

This paper presents a differentially private algorithm for learning fair machine learning models that satisfy statistical parity. By estimating output distributions, computing a Wasserstein barycenter, and using optimal transports, the algorithm can post-process a pre-trained model to improve fairness while managing the trade-off with model accuracy.

The approach is an important step towards building machine learning systems that are both private and fair, addressing critical concerns around the propagation of historical biases. Further research could explore integrating fairness directly into the training process and extending the techniques to other notions of fairness beyond statistical parity.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Optimal Group Fair Classifiers from Linear Post-Processing

Ruicheng Xian, Han Zhao

We propose a post-processing algorithm for fair classification that mitigates model bias under a unified family of group fairness criteria covering statistical parity, equal opportunity, and equalized odds, applicable to multi-class problems and both attribute-aware and attribute-blind settings. It achieves fairness by re-calibrating the output score of the given base model with a fairness cost -- a linear combination of the (predicted) group memberships. Our algorithm is based on a representation result showing that the optimal fair classifier can be expressed as a linear post-processing of the loss function and the group predictor, derived via using these as sufficient statistics to reformulate the fair classification problem as a linear program. The parameters of the post-processor are estimated by solving the empirical LP. Experiments on benchmark datasets show the efficiency and effectiveness of our algorithm at reducing disparity compared to existing algorithms, including in-processing, especially on larger problems.

5/8/2024

cs.LG cs.CY

🏅

Differentially Private Fair Binary Classifications

Hrad Ghoukasian, Shahab Asoodeh

In this work, we investigate binary classification under the constraints of both differential privacy and fairness. We first propose an algorithm based on the decoupling technique for learning a classifier with only fairness guarantee. This algorithm takes in classifiers trained on different demographic groups and generates a single classifier satisfying statistical parity. We then refine this algorithm to incorporate differential privacy. The performance of the final algorithm is rigorously examined in terms of privacy, fairness, and utility guarantees. Empirical evaluations conducted on the Adult and Credit Card datasets illustrate that our algorithm outperforms the state-of-the-art in terms of fairness guarantees, while maintaining the same level of privacy and utility.

5/21/2024

cs.LG cs.CR cs.IT stat.ML

FRAPPE: A Group Fairness Framework for Post-Processing Everything

Alexandru Tifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, Flavien Prost

Despite achieving promising fairness-error trade-offs, in-processing mitigation techniques for group fairness cannot be employed in numerous practical applications with limited computation resources or no access to the training pipeline of the prediction model. In these situations, post-processing is a viable alternative. However, current methods are tailored to specific problem settings and fairness definitions and hence, are not as broadly applicable as in-processing. In this work, we propose a framework that turns any regularized in-processing method into a post-processing approach. This procedure prescribes a way to obtain post-processing techniques for a much broader range of problem settings than the prior post-processing literature. We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods. Finally, we demonstrate several advantages of a modular mitigation strategy that disentangles the training of the prediction model from the fairness mitigation, including better performance on tasks with partial group labels.

6/21/2024

cs.LG cs.CY

📈

Metrizing Fairness

Yves Rychener, Bahar Taskesen, Daniel Kuhn

We study supervised learning problems that have significant effects on individuals from two demographic groups, and we seek predictors that are fair with respect to a group fairness criterion such as statistical parity (SP). A predictor is SP-fair if the distributions of predictions within the two groups are close in Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we identify conditions under which hard SP constraints are guaranteed to improve predictive accuracy. We also showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators, which are constructed from random mini-batches of training samples, if unfairness is measured by the squared $mathcal L^2$-distance or by a squared maximum mean discrepancy. In this case, the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on synthetic and real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs -- sometimes orders of magnitude faster.

6/12/2024

cs.LG stat.ML