Prediction without Preclusion: Recourse Verification with Reachable Sets

arXiv:2308.12820

Published 5/2/2024 by Avni Kothari, Bogdan Kulynych, Tsui-Wei Weng, Berk Ustun

Abstract

Machine learning models are often used to decide who receives a loan, a job interview, or a public benefit. Models in such settings use features without considering their actionability. As a result, they can assign predictions that are fixed, meaning that individuals who are denied loans and interviews are, in fact, precluded from access to credit and employment. In this work, we introduce a procedure called recourse verification to test if a model assigns fixed predictions to its decision subjects. We propose a model-agnostic approach for recourse verification with reachable sets, i.e., the set of all points that a person can reach through their actions in feature space. We develop methods to construct reachable sets for discrete feature spaces, which can certify the responsiveness of any model by simply querying its predictions. We conduct a comprehensive empirical study on the infeasibility of recourse on datasets from consumer finance. Our results highlight how models can inadvertently preclude access by assigning fixed predictions and underscore the need to account for actionability in model development.

Overview

Machine learning models are often used to make important decisions about people's access to loans, jobs, and public benefits.
These models use various features about individuals without considering whether those features are actually actionable or changeable by the individual.
As a result, the models can assign fixed predictions, meaning some people are permanently denied access to credit, employment, and other opportunities.

Plain English Explanation

In today's world, machine learning models are frequently used to make decisions that significantly impact people's lives. For example, these models might determine who gets approved for a loan, who gets invited to a job interview, or who receives a public benefit. The problem is that these models often use personal characteristics or features about the individuals, without considering whether those features are actually things the person can change or control.

This can lead to a concerning situation where the model assigns a fixed prediction to a person, for instance by denying them a loan or a job interview. A prediction is fixed when none of the actions available to the person would change it: the decision rests on features they cannot alter, so improving the things they do control makes no difference. In essence, these individuals are precluded from accessing things like credit or employment opportunities.

To address this issue, the researchers introduce a new approach called "recourse verification." The goal is to test whether a model is assigning fixed predictions, or if there's a way for people to take actions that would lead to a different outcome. The researchers develop methods to map out the "reachable set" - the set of all points in the feature space that a person can realistically achieve through their own efforts. By analyzing these reachable sets, they can certify whether a model is responsive to changes made by the individual, or if it has assigned fixed, unchangeable predictions.

Technical Explanation

The paper presents a procedure called "recourse verification" to test if machine learning models assign fixed predictions to their decision subjects. The authors propose a model-agnostic approach for recourse verification using "reachable sets" - the set of all points in the feature space that a person can reach through their own actions.
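The two notions in this paragraph can be written compactly. The notation below is an assumption made for illustration and is not taken from this summary: $f$ is the model, $f(x) = -1$ marks a denial, $A(x)$ is the set of actions available to a person with features $x$ (assumed to include the null action $a = 0$), and $R_A(x)$ is the resulting reachable set.

```latex
\begin{align*}
R_A(x) &= \{\, x + a \;:\; a \in A(x) \,\}, \\
\text{a denied $x$ has recourse} &\iff \exists\, x' \in R_A(x) \ \text{such that}\ f(x') = +1, \\
\text{$x$ has a fixed prediction} &\iff f(x') = f(x) \ \text{for all}\ x' \in R_A(x).
\end{align*}
```

Under this reading, the reachable set depends only on $x$ and the action set, not on the model, which is what allows the check to be model-agnostic.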

The researchers develop methods to construct reachable sets for discrete feature spaces, which allows them to certify the responsiveness of any model by simply querying its predictions. They conduct a comprehensive empirical study on datasets from consumer finance, which highlights how models can inadvertently preclude access by assigning fixed predictions.
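To make the idea of certifying responsiveness "by simply querying predictions" concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the toy features, and the toy model are all hypothetical, and it assumes each feature can be changed independently of the others, so the reachable set is a plain Cartesian product.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Set, Tuple

Point = Tuple[int, ...]
ValueRule = Callable[[int], Sequence[int]]


def reachable_set(x: Point, allowed: Dict[int, ValueRule]) -> Set[Point]:
    """Enumerate every point reachable from x.

    `allowed` maps a feature index to a rule returning the values that feature
    may take given its current value. Features without a rule are immutable.
    Assumption: features change independently of one another.
    """
    choices = []
    for j, value in enumerate(x):
        rule = allowed.get(j)
        options = {value} if rule is None else set(rule(value)) | {value}
        choices.append(sorted(options))
    return set(product(*choices))


def verify_recourse(x: Point, predict, allowed, target: int = 1) -> bool:
    """True if at least one reachable point receives the target prediction."""
    return any(predict(p) == target for p in reachable_set(x, allowed))


# Hypothetical toy features: (age_group, has_guarantor, ever_defaulted)
toy_allowed = {
    1: lambda v: [0, 1],       # a guarantor can be added or dropped
    2: lambda v: range(v, 2),  # a default on record cannot be undone
}


def toy_model(p: Point) -> int:
    """Hypothetical black-box model: approve only with a guarantor and no default."""
    _, has_guarantor, ever_defaulted = p
    return 1 if has_guarantor == 1 and ever_defaulted == 0 else 0


print(verify_recourse((1, 0, 0), toy_model, toy_allowed))  # True
print(verify_recourse((1, 0, 1), toy_model, toy_allowed))  # False
```

The second person is denied and every point they can reach is also denied, so their prediction is fixed. The check only calls `predict`, so it would work the same way for any classifier.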

The key technical contributions include:

A formal definition and characterization of "recourse" and "reachable sets" for discrete feature spaces.
Algorithms to efficiently compute reachable sets for discrete features.
Experimental validation of the recourse verification approach on real-world datasets, demonstrating the prevalence of fixed predictions.
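The last two contributions can be illustrated with a small sketch: building a reachable set by repeatedly applying single-step actions, then measuring how many denied individuals in a sample have fixed predictions. This is a plain breadth-first enumeration chosen for clarity, not the paper's algorithm, and the step rule, model, and sample are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Sequence, Set, Tuple

Point = Tuple[int, ...]


def reachable_by_steps(x: Point, step: Callable[[Point], Iterable[Point]]) -> Set[Point]:
    """Breadth-first closure of x under a one-step action function."""
    seen = {x}
    queue = deque([x])
    while queue:
        current = queue.popleft()
        for nxt in step(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def fixed_prediction_rate(points: Sequence[Point], predict, step, target: int = 1) -> float:
    """Share of denied points whose entire reachable set is also denied."""
    denied = [p for p in points if predict(p) != target]
    if not denied:
        return 0.0
    fixed = sum(
        all(predict(q) != target for q in reachable_by_steps(p, step))
        for p in denied
    )
    return fixed / len(denied)


# Hypothetical toy features: (accounts_ever_opened in 0..3, has_degree)
def toy_step(p: Point):
    accounts, degree = p
    if accounts < 3:
        yield (accounts + 1, degree)  # a lifetime count can only grow
    if degree == 0:
        yield (accounts, 1)           # a degree can be earned, not revoked


def toy_model(p: Point) -> int:
    """Hypothetical model: approve only with a degree and at most one account."""
    accounts, degree = p
    return 1 if accounts <= 1 and degree == 1 else 0


sample = [(0, 0), (1, 0), (2, 0), (3, 1), (1, 1)]
print(fixed_prediction_rate(sample, toy_model, toy_step))  # 0.5
```

Four of the five people in the sample are denied. Two of them can earn a degree and be approved, while the other two already have too many accounts on record and cannot reach any approved point, giving a fixed-prediction rate of 0.5 among the denied.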

Critical Analysis

The paper raises an important issue with the use of machine learning models for high-stakes decision-making. By not accounting for the actionability of features, these models can effectively lock individuals out of opportunities, because nothing those individuals are able to change would alter the prediction.

One limitation of the current work is that it focuses on discrete feature spaces. While this is a meaningful first step, many real-world applications involve continuous features, which may require different techniques for reachable set construction and analysis. Additionally, the paper does not delve into the ethical implications of models making fixed predictions, which is an area that deserves further exploration.

Future research could investigate methods for incorporating causality and sparsity into the recourse verification process, or explore approaches for probabilistic dataset reconstruction to better understand the underlying data distribution. Techniques for generating counterfactual explanations could also be leveraged to provide individuals with actionable insights on how to improve their outcomes.

Conclusion

This paper highlights a critical issue with the use of machine learning models in high-stakes decision-making. By not considering the actionability of features, these models can effectively preclude individuals from accessing important opportunities like loans, jobs, and public benefits. The proposed recourse verification approach provides a way to test for this problem and identify models that are assigning fixed, unchangeable predictions.

Moving forward, it will be essential for model developers to prioritize actionability and causal relationships when designing machine learning systems that make decisions about people's lives. Addressing these concerns can help ensure that these powerful technologies are used in a way that promotes fairness and expands access to critical resources.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Relevance-aware Algorithmic Recourse

Dongwhi Kim, Nuno Moniz

As machine learning continues to gain prominence, transparency and explainability are increasingly critical. Without an understanding of these models, they can replicate and worsen human bias, adversely affecting marginalized communities. Algorithmic recourse emerges as a tool for clarifying decisions made by predictive models, providing actionable insights to alter outcomes. They answer, 'What do I have to change?' to achieve the desired result. Despite their importance, current algorithmic recourse methods treat all domain values equally, which is unrealistic in real-world settings. In this paper, we propose a novel framework, Relevance-Aware Algorithmic Recourse (RAAR), that leverages the concept of relevance in applying algorithmic recourse to regression tasks. We conducted multiple experiments on 15 datasets to outline how relevance influences recourses. Results show that relevance contributes algorithmic recourses comparable to well-known baselines, with greater efficiency and lower relative costs.

5/30/2024

cs.LG

Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets

Eleni Straitouri, Suhas Thejaswi, Manuel Gomez Rodriguez

Decision support systems based on prediction sets help humans solve multiclass classification tasks by narrowing down the set of potential label values to a subset of them, namely a prediction set, and asking them to always predict label values from the prediction sets. While this type of systems have been proven to be effective at improving the average accuracy of the predictions made by humans, by restricting human agency, they may cause harm: a human who has succeeded at predicting the ground-truth label of an instance on their own may have failed had they used these systems. In this paper, our goal is to control how frequently a decision support system based on prediction sets may cause harm, by design. To this end, we start by characterizing the above notion of harm using the theoretical framework of structural causal models. Then, we show that, under a natural, albeit unverifiable, monotonicity assumption, we can estimate how frequently a system may cause harm using only predictions made by humans on their own. Further, we also show that, under a weaker monotonicity assumption, which can be verified experimentally, we can bound how frequently a system may cause harm again using only predictions made by humans on their own. Building upon these assumptions, we introduce a computational framework to design decision support systems based on prediction sets that are guaranteed to cause harm less frequently than a user-specified value using conformal risk control. We validate our framework using real human predictions from two different human subject studies and show that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.

6/12/2024

cs.LG cs.CY cs.HC

Algorithmic Recourse with Missing Values

Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike

This paper proposes a new framework of algorithmic recourse (AR) that works even in the presence of missing values. AR aims to provide a recourse action for altering the undesired prediction result given by a classifier. Existing AR methods assume that we can access complete information on the features of an input instance. However, we often encounter missing values in a given instance (e.g., due to privacy concerns), and previous studies have not discussed such a practical situation. In this paper, we first empirically and theoretically show the risk that a naive approach with a single imputation technique fails to obtain good actions regarding their validity, cost, and features to be changed. To alleviate this risk, we formulate the task of obtaining a valid and low-cost action for a given incomplete instance by incorporating the idea of multiple imputation. Then, we provide some theoretical analyses of our task and propose a practical solution based on mixed-integer linear optimization. Experimental results demonstrated the efficacy of our method in the presence of missing values compared to the baselines.

5/24/2024

cs.LG stat.ML

Learning Decision Trees and Forests with Algorithmic Recourse

Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike

This paper proposes a new algorithm for learning accurate tree-based models while ensuring the existence of recourse actions. Algorithmic Recourse (AR) aims to provide a recourse action for altering the undesired prediction result given by a model. Typical AR methods provide a reasonable action by solving an optimization task of minimizing the required effort among executable actions. In practice, however, such actions do not always exist for models optimized only for predictive performance. To alleviate this issue, we formulate the task of learning an accurate classification tree under the constraint of ensuring the existence of reasonable actions for as many instances as possible. Then, we propose an efficient top-down greedy algorithm by leveraging the adversarial training techniques. We also show that our proposed algorithm can be applied to the random forest, which is known as a popular framework for learning tree ensembles. Experimental results demonstrated that our method successfully provided reasonable actions to more instances than the baselines without significantly degrading accuracy and computational efficiency.

6/4/2024

cs.LG stat.ML