Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets

2406.06671

Published 6/12/2024 by Eleni Straitouri, Suhas Thejaswi, Manuel Gomez Rodriguez

Abstract

Decision support systems based on prediction sets help humans solve multiclass classification tasks by narrowing down the set of potential label values to a subset of them, namely a prediction set, and asking them to always predict label values from the prediction sets. While this type of system has been proven to be effective at improving the average accuracy of the predictions made by humans, by restricting human agency, they may cause harm: a human who has succeeded at predicting the ground-truth label of an instance on their own may have failed had they used these systems. In this paper, our goal is to control how frequently a decision support system based on prediction sets may cause harm, by design. To this end, we start by characterizing the above notion of harm using the theoretical framework of structural causal models. Then, we show that, under a natural, albeit unverifiable, monotonicity assumption, we can estimate how frequently a system may cause harm using only predictions made by humans on their own. Further, we also show that, under a weaker monotonicity assumption, which can be verified experimentally, we can bound how frequently a system may cause harm again using only predictions made by humans on their own. Building upon these assumptions, we introduce a computational framework to design decision support systems based on prediction sets that are guaranteed to cause harm less frequently than a user-specified value using conformal risk control. We validate our framework using real human predictions from two different human subject studies and show that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.

Overview

This paper explores how to control counterfactual harm in decision support systems based on prediction sets.
A prediction set narrows the possible label values for an input down to a subset, and the human is asked to always predict a label from that subset.
The authors propose a framework for designing these systems so that they cause harm less frequently than a user-specified value.

Plain English Explanation

Decision support systems are tools that help humans make decisions by providing predictions or recommendations. Traditionally, these systems have provided a single prediction for each input. However, recent research has shown that providing a set of possible outcomes, or a "prediction set," can improve human decision-making.

The challenge is that requiring people to choose from the prediction set restricts their agency and can cause "counterfactual harm": a person who would have predicted the correct label on their own may fail when using the system. For example, if the prediction set leaves out the correct label, a person who would otherwise have chosen it can no longer do so.

This paper proposes a way to control how frequently a decision support system based on prediction sets causes counterfactual harm. The key ideas are to:

Characterize counterfactual harm formally using structural causal models.
Estimate, or bound, how frequently a system causes harm using only predictions that humans made on their own, under monotonicity assumptions.
Use conformal risk control to design systems that cause harm less frequently than a user-specified value.

By taking these steps, the authors aim to limit how often a system makes a person worse off than they would have been on their own. Their experiments indicate that this comes with a trade-off between accuracy and counterfactual harm.

Technical Explanation

The paper starts by characterizing counterfactual harm in decision support systems based on prediction sets. Formally, the authors define harm using the theoretical framework of structural causal models: harm occurs when a human who succeeded at predicting the ground-truth label of an instance on their own would have failed had they used the system.
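To make the setting concrete, here is a minimal sketch of a decision support system based on prediction sets. It is not the authors' construction: the thresholding rule in `prediction_set` and the ranking-based human model in `restricted_choice` are both hypothetical and exist only to illustrate how restricting the choice can turn a correct prediction into a wrong one.

```python
import numpy as np

def prediction_set(scores, threshold):
    """Return the labels whose classifier score is at least `threshold`.

    Assumption: `scores` holds one score per label (e.g. softmax outputs).
    The top-scoring label is always kept so the set is never empty.
    """
    scores = np.asarray(scores, dtype=float)
    labels = set(np.flatnonzero(scores >= threshold).tolist())
    labels.add(int(np.argmax(scores)))
    return labels

def restricted_choice(human_ranking, allowed):
    """Hypothetical human model: pick the most preferred label that is allowed."""
    for label in human_ranking:
        if label in allowed:
            return label
    raise ValueError("prediction set is empty")

# The classifier favours labels 0 and 1; the human's first choice is label 2.
allowed = prediction_set([0.6, 0.3, 0.1], threshold=0.25)   # {0, 1}
on_their_own = restricted_choice([2, 1, 0], {0, 1, 2})      # 2
with_system = restricted_choice([2, 1, 0], allowed)         # 1
print(allowed, on_their_own, with_system)
```

If the ground-truth label is 2, the human is correct on their own but wrong when restricted to the set, which is exactly the kind of harm the paper sets out to control.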

The authors then show two ways to quantify how frequently a system may cause harm, both using only predictions made by humans on their own:

Estimation under a stronger assumption: Under a natural, albeit unverifiable, monotonicity assumption, the frequency of harm can be estimated.
Bounding under a weaker assumption: Under a weaker monotonicity assumption, which can be verified experimentally, the frequency of harm can be bounded.

Building on these results, they introduce a computational framework that uses conformal risk control to design systems guaranteed to cause harm less frequently than a user-specified value.
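The abstract states that, under a monotonicity assumption, the frequency of harm can be estimated using only predictions made by humans on their own. The sketch below shows one way such an estimate could look. It rests on an assumed reading of monotonicity that is not spelled out in this summary: a human who is correct on their own stays correct whenever the prediction set still contains the true label. Under that assumption, harm occurs exactly when the human was correct on their own and the set excludes the true label. The function name and inputs are hypothetical.

```python
import numpy as np

def estimated_harm_frequency(own_pred, true_label, pred_sets):
    """Fraction of instances where the human was right alone but the set omits the truth.

    own_pred   : labels the humans predicted on their own
    true_label : ground-truth labels
    pred_sets  : one set of allowed labels per instance
    Assumes the monotonicity reading described in the lead-in.
    """
    own_correct = np.asarray(own_pred) == np.asarray(true_label)
    label_excluded = np.array(
        [y not in s for y, s in zip(true_label, pred_sets)], dtype=bool
    )
    return float(np.mean(own_correct & label_excluded))

# Four instances; only the second one combines a correct solo prediction
# with a prediction set that leaves out the true label.
harm = estimated_harm_frequency(
    own_pred=[0, 1, 2, 1],
    true_label=[0, 1, 1, 2],
    pred_sets=[{0, 1}, {0}, {1, 2}, {0, 1}],
)
print(harm)  # 0.25
```

Note that the estimate needs no data on how humans behave when using the system, which matches the abstract's claim that predictions made by humans on their own are sufficient.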

The paper validates the framework using real human predictions from two different human subject studies, and shows that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.
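The abstract says the framework relies on conformal risk control to keep the frequency of harm below a user-specified value. The following is a sketch of the generic conformal risk control selection rule, not the authors' exact procedure. It assumes prediction sets of the hypothetical form {y : score(y) >= 1 - lambda}, a harm loss that is 1 when the human was correct on their own and the true label is excluded, and a loss that does not increase as lambda grows. All function and parameter names are illustrative.

```python
import numpy as np

def harm_losses(true_scores, own_correct, lambdas):
    """Loss matrix: entry [i, j] is 1.0 if instance i is harmed at lambdas[j].

    true_scores : classifier score of the true label for each calibration instance
    own_correct : whether the human predicted correctly on their own
    Assumes the set at level lam is {y : score(y) >= 1 - lam}.
    """
    true_scores = np.asarray(true_scores, dtype=float)[:, None]
    own_correct = np.asarray(own_correct, dtype=bool)[:, None]
    thresholds = 1.0 - np.asarray(lambdas, dtype=float)[None, :]
    return (own_correct & (true_scores < thresholds)).astype(float)

def crc_threshold(losses, lambdas, alpha, loss_bound=1.0):
    """Smallest lambda whose adjusted empirical risk is at most alpha.

    Uses the standard correction n/(n+1) * risk + loss_bound/(n+1).
    `lambdas` must be sorted in ascending order. Returns None if no
    candidate value satisfies the constraint.
    """
    n = losses.shape[0]
    adjusted = (n / (n + 1)) * losses.mean(axis=0) + loss_bound / (n + 1)
    feasible = np.flatnonzero(adjusted <= alpha)
    if feasible.size == 0:
        return None
    return float(np.asarray(lambdas, dtype=float)[feasible[0]])

lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
losses = harm_losses(
    true_scores=[0.9, 0.5, 0.2, 0.7],
    own_correct=[True, True, True, False],
    lambdas=lambdas,
)
print(crc_threshold(losses, lambdas, alpha=0.45))  # 0.5
```

Picking the smallest feasible lambda gives the smallest sets that still meet the harm constraint. With so few calibration points the correction term dominates, so very small values of alpha cannot be certified and the function returns None.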

Critical Analysis

The authors acknowledge that the stronger monotonicity assumption, while natural, cannot be verified. Only the weaker assumption can be checked experimentally, and it yields a bound on the frequency of harm rather than an estimate. Additionally, the paper does not address the potential for data poisoning attacks that could skew the prediction sets and lead to counterfactual harm.

It would also be interesting to see how these techniques could be extended beyond multiclass classification tasks, as real-world decision support systems may involve more complex decision-making processes.

Overall, this paper presents a valuable contribution to the ongoing research on improving the safety and reliability of decision support systems. By focusing on the critical issue of counterfactual harm, the authors highlight an important consideration for the responsible development of these systems.

Conclusion

This paper addresses the challenge of controlling counterfactual harm in decision support systems that rely on prediction sets. By characterizing harm with structural causal models and applying conformal risk control, the proposed framework designs systems that are guaranteed to cause harm less frequently than a user-specified value.

While the approaches have some limitations, this research represents an important step towards developing more trustworthy and reliable decision support systems that can truly complement and empower human decision-makers.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards Human-AI Complementarity with Prediction Sets

Giovanni De Toni, Nastaran Okati, Suhas Thejaswi, Eleni Straitouri, Manuel Gomez-Rodriguez

Decision support systems based on prediction sets have proven to be effective at helping human experts solve classification tasks. Rather than providing single-label predictions, these systems provide sets of label predictions constructed using conformal prediction, namely prediction sets, and ask human experts to predict label values from these sets. In this paper, we first show that the prediction sets constructed using conformal prediction are, in general, suboptimal in terms of average accuracy. Then, we show that the problem of finding the optimal prediction sets under which the human experts achieve the highest average accuracy is NP-hard. More strongly, unless P = NP, we show that the problem is hard to approximate to any factor less than the size of the label set. However, we introduce a simple and efficient greedy algorithm that, for a large class of expert models and non-conformity scores, is guaranteed to find prediction sets that provably offer equal or greater performance than those constructed using conformal prediction. Further, using a simulation study with both synthetic and real expert predictions, we demonstrate that, in practice, our greedy algorithm finds near-optimal prediction sets offering greater performance than conformal prediction.

5/29/2024

cs.LG cs.CY cs.HC

Conformal Prediction Sets Improve Human Decision Making

Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noel Vouitsis

In response to everyday queries, humans explicitly signal uncertainty and offer alternative answers when they are unsure. Machine learning models that output calibrated prediction sets through conformal prediction mimic this human behaviour; larger sets signal greater uncertainty while providing alternatives. In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee. The results show that quantifying model uncertainty with conformal prediction is helpful for human-in-the-loop decision making and human-AI teams.

6/11/2024

cs.LG cs.HC stat.ML

The Effect of Data Poisoning on Counterfactual Explanations

André Artelt, Shubham Sharma, Freddy Lecué, Barbara Hammer

Counterfactual explanations provide a popular method for analyzing the predictions of black-box systems, and they can offer the opportunity for computational recourse by suggesting actionable changes on how to change the input to obtain a different (i.e. more favorable) system output. However, recent work highlighted their vulnerability to different types of manipulations. This work studies the vulnerability of counterfactual explanations to data poisoning. We formally introduce and investigate data poisoning in the context of counterfactual explanations for increasing the cost of recourse on three different levels: locally for a single instance, or a sub-group of instances, or globally for all instances. In this context, we characterize and prove the correctness of several different data poisonings. We also empirically demonstrate that state-of-the-art counterfactual generation methods and toolboxes are vulnerable to such data poisoning.

5/22/2024

cs.LG cs.AI

From algorithms to action: improving patient care requires causality

Wouter A. C. van Amsterdam, Pim A. de Jong, Joost J. C. Verhoeff, Tim Leiner, Rajesh Ranganath

In cancer research there is much interest in building and validating outcome prediction models to support treatment decisions. However, because most outcome prediction models are developed and validated without regard to the causal aspects of treatment decision making, many published outcome prediction models may cause harm when used for decision making, despite being found accurate in validation studies. Guidelines on prediction model validation and the checklist for risk model endorsement by the American Joint Committee on Cancer do not protect against prediction models that are accurate during development and validation but harmful when used for decision making. We explain why this is the case and how to build and validate models that are useful for decision making.

4/3/2024

cs.LG cs.CY