(De)Noise: Moderating the Inconsistency Between Human Decision-Makers

Read original: arXiv:2407.11225 - Published 7/17/2024 by Nina Grgić-Hlača, Junaid Ali, Krishna P. Gummadi, Jennifer Wortman Vaughan

Overview

This paper studies inconsistency between human decision-makers and whether algorithmic decision aids can moderate it, using real estate appraisal as the setting.
The researchers focus on "noise": the variation in decisions when different people evaluate the same case.
In a large-scale human-subject experiment, they test two forms of algorithmic assistance for reducing this noise and improving the consistency of human decisions.

Plain English Explanation

Humans often make different decisions when faced with the same problem or information. This can be a challenge in fields like medicine, law, and business, where important decisions need to be made consistently. The researchers in this paper refer to this inconsistency as "noise" in the decision-making process.

To address this issue, the researchers explore how algorithmic tools can help moderate the inconsistency between human decision-makers. They investigate ways to design algorithmic recommendations to achieve human-AI alignment and reduce the "noise" in human decision-making.

By using algorithmic decision aids, the researchers aim to leverage human expertise and algorithmic prediction to improve the consistency of decisions made by human experts. This could have important implications for enhancing fair decision-making in various domains.

Technical Explanation

Prior research in psychology has found that decisions vary across time and, even more, across people, including in settings one might expect to be objective, such as sentencing, job performance evaluations, and real estate appraisals. The study takes real estate appraisal as its setting: in a large-scale human-subject experiment, respondents estimated real estate prices and then reviewed and updated those estimates.

The researchers compared two forms of algorithmic assistance against a baseline in which respondents simply reviewed their initial estimates one by one: (i) reviewing estimates in a series of algorithmically chosen pairwise comparisons, and (ii) receiving traditional machine advice.

Both strategies influenced responses. Compared with the baseline, they led to a higher propensity to update initial estimates, more accurate post-review estimates, and greater consistency between the post-review estimates of different respondents. The effects were more pronounced with traditional machine advice, but reviewing algorithmically chosen pairs does not require access to ground truth data, so it can be applied in a wider range of settings.
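
To make "noise" and its reduction concrete, the sketch below computes three quantities for a set of price estimates before and after review: how often estimates were changed, how far they are from the true price, and how much different respondents disagree on the same item. It is illustrative only. The metric choices (mean absolute percentage error, and the coefficient of variation across respondents) and all numbers are assumptions made for this example, not the measures or data used in the paper.

```python
from statistics import mean, pstdev

# Rows are respondents, columns are items (hypothetical prices, in thousands).

def update_rate(initial, revised):
    """Fraction of individual estimates that changed after review."""
    pairs = [(a, b) for row_i, row_r in zip(initial, revised)
             for a, b in zip(row_i, row_r)]
    return sum(a != b for a, b in pairs) / len(pairs)

def mean_abs_pct_error(estimates, truth):
    """Average relative distance between estimates and the true prices."""
    errors = [abs(e - t) / t for row in estimates for e, t in zip(row, truth)]
    return mean(errors)

def noise(estimates):
    """Between-respondent disagreement: for each item, the standard
    deviation of estimates divided by their mean, averaged over items."""
    per_item = [list(col) for col in zip(*estimates)]  # one list per item
    return mean(pstdev(col) / mean(col) for col in per_item)

# Made-up data: three respondents, three properties.
truth = [230, 300, 150]
initial = [[200, 310, 150], [260, 290, 180], [230, 350, 120]]
revised = [[220, 310, 150], [240, 300, 160], [230, 320, 140]]

print("update rate:", update_rate(initial, revised))
print("error before/after:", mean_abs_pct_error(initial, truth),
      mean_abs_pct_error(revised, truth))
print("noise before/after:", noise(initial), noise(revised))
```

Note that `noise` needs no ground truth, whereas `mean_abs_pct_error` does. A consistency measure of this kind can therefore be tracked even in settings where true values are unavailable.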

Critical Analysis

The researchers acknowledge several limitations and caveats in their work. They note that the effectiveness of the algorithmic approach may depend on the quality and consistency of the human experts' input data, as well as the complexity of the decision-making tasks.

Additionally, there are concerns around the potential biases and fairness implications of relying on algorithmic recommendations, particularly if the underlying data or model design inadvertently reflects societal biases.

Further research is needed to better understand the interplay between human expertise and algorithmic prediction and to explore ways to ensure that the use of algorithmic decision aids enhances, rather than undermines, the quality and fairness of human decision-making.

Conclusion

This paper highlights the significant inconsistency, or "noise," that can exist in human decision-making, even among highly trained professionals. By leveraging algorithmic tools to moderate this inconsistency, the researchers have demonstrated a promising approach to improve the consistency and accuracy of human decision-making.

This research has important implications for high-stakes decisions in fields such as medicine, law, and business, where consistent and reliable decision-making is crucial. However, further exploration is needed to address potential fairness and bias concerns and to better understand the interplay between human expertise and algorithmic prediction.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

(De)Noise: Moderating the Inconsistency Between Human Decision-Makers

Nina Grgić-Hlača, Junaid Ali, Krishna P. Gummadi, Jennifer Wortman Vaughan

Prior research in psychology has found that people's decisions are often inconsistent. An individual's decisions vary across time, and decisions vary even more across people. Inconsistencies have been identified not only in subjective matters, like matters of taste, but also in settings one might expect to be more objective, such as sentencing, job performance evaluations, or real estate appraisals. In our study, we explore whether algorithmic decision aids can be used to moderate the degree of inconsistency in human decision-making in the context of real estate appraisal. In a large-scale human-subject experiment, we study how different forms of algorithmic assistance influence the way that people review and update their estimates of real estate prices. We find that both (i) asking respondents to review their estimates in a series of algorithmically chosen pairwise comparisons and (ii) providing respondents with traditional machine advice are effective strategies for influencing human responses. Compared to simply reviewing initial estimates one by one, the aforementioned strategies lead to (i) a higher propensity to update initial estimates, (ii) a higher accuracy of post-review estimates, and (iii) a higher degree of consistency between the post-review estimates of different respondents. While these effects are more pronounced with traditional machine advice, the approach of reviewing algorithmically chosen pairs can be implemented in a wider range of settings, since it does not require access to ground truth data.

7/17/2024

Leveraging Expert Consistency to Improve Algorithmic Decision Support

Maria De-Arteaga, Vincent Jeanselme, Artur Dubrawski, Alexandra Chouldechova

Machine learning (ML) is increasingly being used to support high-stakes decisions. However, there is frequently a construct gap: a gap between the construct of interest to the decision-making task and what is captured in proxies used as labels to train ML models. As a result, ML models may fail to capture important dimensions of decision criteria, hampering their utility for decision support. Thus, an essential step in the design of ML systems for decision support is selecting a target label among available proxies. In this work, we explore the use of historical expert decisions as a rich -- yet also imperfect -- source of information that can be combined with observed outcomes to narrow the construct gap. We argue that managers and system designers may be interested in learning from experts in instances where they exhibit consistency with each other, while learning from observed outcomes otherwise. We develop a methodology to enable this goal using information that is commonly available in organizational information systems. This involves two core steps. First, we propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert. Second, we introduce a label amalgamation approach that allows ML models to simultaneously learn from expert decisions and observed outcomes. Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap, yielding better predictive performance than learning from either observed outcomes or expert decisions alone.

6/4/2024

Designing Algorithmic Recommendations to Achieve Human-AI Complementarity

Bryce McLaughlin, Jann Spiess

Algorithms frequently assist, rather than replace, human decision-makers. However, the design and analysis of algorithms often focus on predicting outcomes and do not explicitly model their effect on human decisions. This discrepancy between the design and role of algorithmic assistants becomes of particular concern in light of empirical evidence that suggests that algorithmic assistants again and again fail to improve human decisions. In this article, we formalize the design of recommendation algorithms that assist human decision-makers without making restrictive ex-ante assumptions about how recommendations affect decisions. We formulate an algorithmic-design problem that leverages the potential-outcomes framework from causal inference to model the effect of recommendations on a human decision-maker's binary treatment choice. Within this model, we introduce a monotonicity assumption that leads to an intuitive classification of human responses to the algorithm. Under this monotonicity assumption, we can express the human's response to algorithmic recommendations in terms of their compliance with the algorithm and the decision they would take if the algorithm sends no recommendation. We showcase the utility of our framework using an online experiment that simulates a hiring task. We argue that our approach explains the relative performance of different recommendation algorithms in the experiment, and can help design solutions that realize human-AI complementarity.

5/3/2024

Technological Shocks and Algorithmic Decision Aids in Credence Goods Markets

Alexander Erlei, Lukas Meub

In credence goods markets such as health care or repair services, consumers rely on experts with superior information to adequately diagnose and treat them. Experts, however, are constrained in their diagnostic abilities, which hurts market efficiency and consumer welfare. Technological breakthroughs that substitute or complement expert judgments have the potential to alleviate consumer mistreatment. This article studies how competitive experts adopt novel diagnostic technologies when skills are heterogeneously distributed and obfuscated to consumers. We differentiate between novel technologies that increase expert abilities, and algorithmic decision aids that complement expert judgments, but do not affect an expert's personal diagnostic precision. We show that high-ability experts may be incentivized to forego the decision aid in order to escape a pooling equilibrium by differentiating themselves from low-ability experts. Results from an online experiment support our hypothesis, showing that high-ability experts are significantly less likely than low-ability experts to invest into an algorithmic decision aid. Furthermore, we document pervasive under-investments, and no effect on expert honesty.

4/16/2024