Learning to Complement and to Defer to Multiple Users

Read original: arXiv:2407.07003 - Published 7/10/2024 by Zheng Zhang, Wenjie Ai, Kevin Wells, David Rosewarne, Thanh-Toan Do, Gustavo Carneiro

Overview

This paper explores ways for AI systems to learn to work collaboratively with multiple human users, both by complementing their capabilities and deferring to their judgments when appropriate.
The researchers propose LECODU (Learning to Complement and to Defer to Multiple Users), a method for human-AI collaborative classification that combines learning to complement with learning to defer and also estimates how many users to involve in each decision.
The techniques could enable more effective human-AI collaboration in areas like content moderation, healthcare, and education.

Plain English Explanation

The paper looks at how AI systems can learn to work together with multiple human users in a collaborative way. The key ideas are:

Learning to Complement: The AI system should be able to learn how to complement the capabilities of different human users. For example, the AI could focus on tasks it is better at, while letting the humans handle things they are better at.
Learning to Defer: The AI system should also learn when to defer to the judgments of human users, rather than trying to make decisions on its own. This could be useful in sensitive domains like content moderation, where human oversight is important.

By developing these abilities, the researchers hope to enable more effective collaboration between humans and AI in areas like healthcare, education, and beyond. The techniques could help AI systems work alongside humans in a more seamless and complementary way.

Technical Explanation

The paper frames human-AI collaborative classification as a choice among three options: the AI classifies on its own, the AI complements users, or the AI defers to users. Its method, LECODU, handles these options in one system rather than studying them in isolation:

Learning to Complement: The AI collaborates with one or more users, so the final decision draws on both the AI's prediction and the users' predictions.
Learning to Defer: The AI hands the decision to the users instead of classifying on its own.
Number of users: LECODU also estimates how many users to engage in each decision. Its training maximises classification accuracy while minimising the collaboration cost of involving users.
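
To make the complement and defer options concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names (`fuse`, `predict`, `select_option`), the naive-Bayes-style fusion rule, the fixed `user_reliability` value, and the per-user cost are all assumptions chosen for illustration. In the paper these choices are learned during training rather than hand-coded.

```python
import math

def fuse(prior, user_labels, user_reliability=0.8):
    """Combine a prior over classes with user labels (hypothetical fusion rule).

    Assumes each user is correct with probability `user_reliability`
    (strictly between 0 and 1) and otherwise picks a wrong class uniformly.
    """
    num_classes = len(prior)
    wrong = (1.0 - user_reliability) / (num_classes - 1)
    scores = [math.log(p + 1e-12) for p in prior]
    for label in user_labels:
        for c in range(num_classes):
            scores[c] += math.log(user_reliability if c == label else wrong)
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def predict(option, ai_probs, user_labels, user_reliability=0.8):
    """Return the predicted class under one of the three options."""
    if option == "ai_alone":
        # The AI classifies on its own; user labels are ignored.
        return argmax(ai_probs)
    if option == "complement":
        # AI probabilities act as the prior and user labels refine it.
        return argmax(fuse(ai_probs, user_labels, user_reliability))
    if option == "defer":
        # The AI's opinion is dropped: a uniform prior plus user labels.
        uniform = [1.0 / len(ai_probs)] * len(ai_probs)
        return argmax(fuse(uniform, user_labels, user_reliability))
    raise ValueError(f"unknown option: {option}")

def select_option(expected_acc, users_needed, cost_per_user):
    """Pick the option with the best accuracy-minus-cost trade-off.

    `expected_acc` and `users_needed` map option names to an estimated
    accuracy and to the number of users consulted (both hypothetical inputs).
    """
    utility = {
        name: expected_acc[name] - cost_per_user * users_needed[name]
        for name in expected_acc
    }
    return max(utility, key=utility.get)
```

As a usage example, with `ai_probs = [0.5, 0.4, 0.1]` and one user voting for class 1, `predict("ai_alone", ...)` returns class 0 while `predict("complement", ...)` returns class 1, because the user's vote outweighs the AI's narrow margin. `select_option` shows the cost side: raising `cost_per_user` pushes the choice back toward the AI working alone.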

According to the paper's abstract, LECODU was evaluated on real-world and synthesized datasets, where it outperformed state-of-the-art human-AI collaborative classification methods. Even when relying on unreliable users with high rates of label noise, it improved significantly over both human decision-makers alone and the AI alone.
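
The paper's abstract, reproduced under Related Papers, states that training maximises classification accuracy and minimises collaboration costs. One generic way to write such a trade-off is shown below. This is an illustrative assumption, not the paper's actual loss, and every symbol is introduced here for the sketch: x is the input, y the true label, m_1, ..., m_K the labels from K available users, k_θ(x) the number of users the model chooses to consult (0 means the AI works alone), f_θ the final prediction, ℓ a classification loss such as cross-entropy, c a cost that does not decrease as more users are consulted, and λ ≥ 0 a weight on that cost.

```latex
\min_{\theta}\;
\mathbb{E}_{(x,\, y,\, m_1, \dots, m_K)}
\Big[\,
  \ell\big(y,\; f_{\theta}(x,\, m_{1:k_{\theta}(x)})\big)
  \;+\; \lambda\, c\big(k_{\theta}(x)\big)
\Big]
```

Setting λ = 0 ignores cost and favours consulting users whenever they help accuracy. Larger values of λ push the model toward classifying alone unless user input is worth its cost.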

Critical Analysis

The paper presents a thoughtful approach to enabling more productive human-AI collaboration, with a number of promising results. However, some potential limitations and areas for further research include:

The experiments focused on relatively constrained classification tasks. Further work is needed to evaluate the techniques in more complex, real-world settings where the boundaries between human and AI capabilities may be less clear.
The paper does not address potential biases or errors that could arise from over-relying on human judgments, especially in sensitive domains like content moderation.
It's unclear how well the proposed techniques would scale to handle a large and diverse pool of human users with varying expertise and reliability.

Overall, the research represents an important step towards more effective human-AI collaboration, but additional work is needed to fully realize the potential of these approaches.

Conclusion

This paper explores innovative ways for AI systems to learn to work collaboratively with human users, both by complementing their capabilities and deferring to their judgments. The proposed techniques could enable more seamless and effective human-AI collaboration in a range of application areas, from content moderation to healthcare to education. While the research shows promising results, further work is needed to address potential limitations and scale the approaches to more complex, real-world settings. Nonetheless, this work represents an important contribution towards realizing the full potential of human-AI teams.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning to Complement and to Defer to Multiple Users

Zheng Zhang, Wenjie Ai, Kevin Wells, David Rosewarne, Thanh-Toan Do, Gustavo Carneiro

With the development of Human-AI Collaboration in Classification (HAI-CC), integrating users and AI predictions becomes challenging due to the complex decision-making process. This process has three options: 1) AI autonomously classifies, 2) learning to complement, where AI collaborates with users, and 3) learning to defer, where AI defers to users. Despite their interconnected nature, these options have been studied in isolation rather than as components of a unified system. In this paper, we address this weakness with the novel HAI-CC methodology, called Learning to Complement and to Defer to Multiple Users (LECODU). LECODU not only combines learning to complement and learning to defer strategies, but it also incorporates an estimation of the optimal number of users to engage in the decision process. The training of LECODU maximises classification accuracy and minimises collaboration costs associated with user involvement. Comprehensive evaluations across real-world and synthesized datasets demonstrate LECODU's superior performance compared to state-of-the-art HAI-CC methods. Remarkably, even when relying on unreliable users with high rates of label noise, LECODU exhibits significant improvement over both human decision-makers alone and AI alone.

7/10/2024

Learning to Complement with Multiple Humans

Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro

Real-world image classification tasks tend to be complex, where expert labellers are sometimes unsure about the classes present in the images, leading to the issue of learning with noisy labels (LNL). The ill-posedness of the LNL task requires the adoption of strong assumptions or the use of multiple noisy labels per training image, resulting in accurate models that work well in isolation but fail to optimise human-AI collaborative classification (HAI-CC). Unlike such LNL methods, HAI-CC aims to leverage the synergies between human expertise and AI capabilities but requires clean training labels, limiting its real-world applicability. This paper addresses this gap by introducing the innovative Learning to Complement with Multiple Humans (LECOMH) approach. LECOMH is designed to learn from noisy labels without depending on clean labels, simultaneously maximising collaborative accuracy while minimising the cost of human collaboration, measured by the number of human expert annotations required per image. Additionally, new benchmarks featuring multiple noisy labels for both training and testing are proposed to evaluate HAI-CC methods. Through quantitative comparisons on these benchmarks, LECOMH consistently outperforms competitive HAI-CC approaches, human labellers, multi-rater learning, and noisy-label learning methods across various datasets, offering a promising solution for addressing real-world image classification challenges.

5/2/2024

Cost-Sensitive Learning to Defer to Multiple Experts with Workload Constraints

Jean V. Alves, Diogo Leitão, Sérgio Jesus, Marco O. P. Sampaio, Javier Liébana, Pedro Saleiro, Mário A. T. Figueiredo, Pedro Bizarro

Learning to defer (L2D) aims to improve human-AI collaboration systems by learning how to defer decisions to humans when they are more likely to be correct than an ML classifier. Existing research in L2D overlooks key real-world aspects that impede its practical adoption, namely: i) neglecting cost-sensitive scenarios, where type I and type II errors have different costs; ii) requiring concurrent human predictions for every instance of the training dataset; and iii) not dealing with human work-capacity constraints. To address these issues, we propose the deferral under cost and capacity constraints framework (DeCCaF). DeCCaF is a novel L2D approach, employing supervised learning to model the probability of human error under less restrictive data requirements (only one expert prediction per instance) and using constraint programming to globally minimize the error cost, subject to workload limitations. We test DeCCaF in a series of cost-sensitive fraud detection scenarios with different teams of 9 synthetic fraud analysts, with individual work-capacity constraints. The results demonstrate that our approach performs significantly better than the baselines in a wide array of scenarios, achieving an average 8.4% reduction in the misclassification cost. The code used for the experiments is available at https://github.com/feedzai/deccaf

8/21/2024

Learning to Defer in Content Moderation: The Human-AI Interplay

Thodoris Lykouris, Wentao Weng

Successful content moderation in online platforms relies on a human-AI collaboration approach. A typical heuristic estimates the expected harmfulness of a post and uses fixed thresholds to decide whether to remove it and whether to send it for human review. This disregards the prediction uncertainty, the time-varying element of human review capacity and post arrivals, and the selective sampling in the dataset (humans only review posts filtered by the admission algorithm). In this paper, we introduce a model to capture the human-AI interplay in content moderation. The algorithm observes contextual information for incoming posts, makes classification and admission decisions, and schedules posts for human review. Only admitted posts receive human reviews on their harmfulness. These reviews help educate the machine-learning algorithms but are delayed due to congestion in the human review system. The classical learning-theoretic way to capture this human-AI interplay is via the framework of learning to defer, where the algorithm has the option to defer a classification task to humans for a fixed cost and immediately receive feedback. Our model contributes to this literature by introducing congestion in the human review system. Moreover, unlike work on online learning with delayed feedback where the delay in the feedback is exogenous to the algorithm's decisions, the delay in our model is endogenous to both the admission and the scheduling decisions. We propose a near-optimal learning algorithm that carefully balances the classification loss from a selectively sampled dataset, the idiosyncratic loss of non-reviewed posts, and the delay loss of having congestion in the human review system. To the best of our knowledge, this is the first result for online learning in contextual queueing systems and hence our analytical framework may be of independent interest.

6/4/2024