Learning to Complement with Multiple Humans

Read original: arXiv:2311.13172 - Published 5/2/2024 by Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro


Overview

Real-world image classification tasks can be complex, with experts sometimes unsure about the correct labels for images.
This leads to the issue of learning with noisy labels (LNL), where the training data has incorrect or ambiguous labels.
Existing LNL methods make strong assumptions or use multiple noisy labels per image, resulting in models that work well in isolation but fail to optimize human-AI collaborative classification (HAI-CC).
HAI-CC aims to leverage the strengths of both human experts and AI, but existing HAI-CC methods require clean training labels, which limits their real-world applicability.

Plain English Explanation

The paper addresses the challenge of image classification in real-world scenarios. In these situations, the labels (the information about what's in the images) provided by human experts can be noisy or uncertain. This makes it difficult for AI systems to learn accurately from the data.

Existing approaches to dealing with noisy labels either make strong assumptions or require multiple labels per image, which can work well in controlled settings but don't translate well to real-world human-AI collaboration. On the other hand, methods that aim to leverage both human and AI capabilities (HAI-CC) need clean training data, which is often not available in practice.

The paper introduces a new approach called LECOMH, which is designed to learn from noisy labels without relying on clean labels. LECOMH aims to maximize the accuracy of the collaborative classification while minimizing the cost of human involvement, as measured by the number of expert annotations required per image.

Technical Explanation

The paper proposes the LECOMH (Learning to Complement with Multiple Humans) approach to address the limitations of existing methods for learning with noisy labels and human-AI collaboration.

LECOMH is designed to learn from noisy labels without depending on clean labels, which are often difficult to obtain in real-world scenarios. The key idea is to learn a complementary classifier that can leverage the strengths of both human experts and the AI model to achieve high collaborative accuracy, while minimizing the number of human annotations required.
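The trade-off between collaborative accuracy and annotation cost can be illustrated with a toy example. This is a minimal sketch, not the paper's method: plain averaging stands in for a learned collaboration module, `target` stands in for a consensus label estimated from the noisy annotations, and `cost_weight`, `combine`, `penalized_loss`, and `best_num_humans` are hypothetical names introduced only for illustration.

```python
import numpy as np

def combine(ai_probs, human_labels, k):
    """Average the AI's class probabilities with one-hot votes from the first k humans.

    Assumption: plain averaging stands in for a learned collaboration module.
    """
    ai_probs = np.asarray(ai_probs, dtype=float)
    picked = np.asarray(human_labels[:k], dtype=int)
    votes = np.eye(len(ai_probs))[picked]  # shape (k, num_classes)
    return np.vstack([ai_probs[None, :], votes]).mean(axis=0)

def penalized_loss(ai_probs, human_labels, target, k, cost_weight):
    """Negative log-likelihood of the target plus a linear cost per human annotation."""
    probs = combine(ai_probs, human_labels, k)
    return -np.log(probs[target] + 1e-12) + cost_weight * k

def best_num_humans(ai_probs, human_labels, target, cost_weight):
    """Pick the number of annotations (0..M) with the lowest penalized loss."""
    losses = [
        penalized_loss(ai_probs, human_labels, target, k, cost_weight)
        for k in range(len(human_labels) + 1)
    ]
    return int(np.argmin(losses))

# Toy case: the AI leans towards class 0, while three annotators all say class 1.
ai = [0.5, 0.3, 0.2]
humans = [1, 1, 1]
print(best_num_humans(ai, humans, target=1, cost_weight=0.1))
print(best_num_humans(ai, humans, target=1, cost_weight=1.0))
```

In this toy case a small cost weight (0.1) selects two annotations, while a large one (1.0) selects none, because the accuracy gain no longer justifies the extra cost. A trained system would have to learn this decision per image rather than search over it with a known target.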

The paper also introduces new benchmarks that feature multiple noisy labels for both training and testing, to better evaluate the performance of HAI-CC methods. Through quantitative comparisons on these benchmarks, the authors show that LECOMH consistently outperforms competitive HAI-CC approaches, human labelers, multi-rater learning, and noisy-label learning methods across various datasets.
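When every test image carries several noisy labels, an evaluation needs a reference label and a way to report accuracy together with annotation cost. The sketch below is one simple possibility, assuming majority vote as the reference and the mean number of annotations per image as the cost. The summary does not say how the paper builds its reference labels, and `majority_vote` and `evaluate` are hypothetical helper names.

```python
import numpy as np

def majority_vote(labels, num_classes):
    """Return the most frequent label per image; ties go to the lowest class index."""
    labels = np.asarray(labels, dtype=int)  # shape (num_images, num_annotators)
    counts = np.stack([np.bincount(row, minlength=num_classes) for row in labels])
    return counts.argmax(axis=1)

def evaluate(predictions, humans_used, reference):
    """Report accuracy against the reference and the mean annotations used per image."""
    predictions = np.asarray(predictions)
    reference = np.asarray(reference)
    accuracy = float((predictions == reference).mean())
    mean_cost = float(np.mean(humans_used))
    return accuracy, mean_cost

# Four images, three noisy annotators each (toy data, not from the paper).
noisy = [[0, 1, 1], [2, 2, 0], [1, 1, 1], [0, 0, 2]]
reference = majority_vote(noisy, num_classes=3)
accuracy, mean_cost = evaluate([1, 2, 0, 0], [0, 2, 1, 3], reference)
print(reference.tolist(), accuracy, mean_cost)  # [1, 2, 1, 0] 0.75 1.5
```

Reporting both numbers side by side makes it visible when a method gains accuracy only by requesting more human annotations.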

Critical Analysis

The paper addresses an important and practical problem in human-AI collaborative classification, where clean training labels are often hard to obtain. The LECOMH approach and the new benchmarks are valuable contributions to the field.

One potential limitation of the research is that it focuses on image classification, so the performance of LECOMH on other types of tasks remains to be seen. Additionally, the paper does not provide a detailed analysis of the computational complexity or training time of the LECOMH approach, which could be an important consideration for real-world deployment.

Further research could explore the applicability of LECOMH to other domains, such as natural language processing or time series analysis, and investigate its performance on more diverse datasets and tasks. Researchers could also compare LECOMH to a wider range of baselines, including more recent approaches for learning with noisy labels and human-AI collaboration.

Conclusion

This paper presents a novel approach, LECOMH, that addresses the challenge of learning with noisy labels in real-world image classification tasks. LECOMH aims to leverage the complementary strengths of human experts and AI models to achieve high collaborative accuracy while minimizing the cost of human involvement.

The introduction of new benchmarks that feature multiple noisy labels for both training and testing, and the consistent performance improvements of LECOMH over competitive methods, make this research a promising step towards more robust and practical solutions for real-world image classification challenges.



Related Papers


Learning to Complement with Multiple Humans

Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro

Real-world image classification tasks tend to be complex, where expert labellers are sometimes unsure about the classes present in the images, leading to the issue of learning with noisy labels (LNL). The ill-posedness of the LNL task requires the adoption of strong assumptions or the use of multiple noisy labels per training image, resulting in accurate models that work well in isolation but fail to optimise human-AI collaborative classification (HAI-CC). Unlike such LNL methods, HAI-CC aims to leverage the synergies between human expertise and AI capabilities but requires clean training labels, limiting its real-world applicability. This paper addresses this gap by introducing the innovative Learning to Complement with Multiple Humans (LECOMH) approach. LECOMH is designed to learn from noisy labels without depending on clean labels, simultaneously maximising collaborative accuracy while minimising the cost of human collaboration, measured by the number of human expert annotations required per image. Additionally, new benchmarks featuring multiple noisy labels for both training and testing are proposed to evaluate HAI-CC methods. Through quantitative comparisons on these benchmarks, LECOMH consistently outperforms competitive HAI-CC approaches, human labellers, multi-rater learning, and noisy-label learning methods across various datasets, offering a promising solution for addressing real-world image classification challenges.

5/2/2024

Learning to Complement and to Defer to Multiple Users

Zheng Zhang, Wenjie Ai, Kevin Wells, David Rosewarne, Thanh-Toan Do, Gustavo Carneiro

With the development of Human-AI Collaboration in Classification (HAI-CC), integrating users and AI predictions becomes challenging due to the complex decision-making process. This process has three options: 1) AI autonomously classifies, 2) learning to complement, where AI collaborates with users, and 3) learning to defer, where AI defers to users. Despite their interconnected nature, these options have been studied in isolation rather than as components of a unified system. In this paper, we address this weakness with the novel HAI-CC methodology, called Learning to Complement and to Defer to Multiple Users (LECODU). LECODU not only combines learning to complement and learning to defer strategies, but it also incorporates an estimation of the optimal number of users to engage in the decision process. The training of LECODU maximises classification accuracy and minimises collaboration costs associated with user involvement. Comprehensive evaluations across real-world and synthesized datasets demonstrate LECODU's superior performance compared to state-of-the-art HAI-CC methods. Remarkably, even when relying on unreliable users with high rates of label noise, LECODU exhibits significant improvement over both human decision-makers alone and AI alone.

7/10/2024


CLImage: Human-Annotated Datasets for Complementary-Label Learning

Hsiu-Hsuan Wang, Tan-Ha Mai, Nai-Xuan Ye, Wei-I Lin, Hsuan-Tien Lin

Complementary-label learning (CLL) is a weakly-supervised learning paradigm that aims to train a multi-class classifier using only complementary labels, which indicate classes to which an instance does not belong. Despite numerous algorithmic proposals for CLL, their practical applicability remains unverified for two reasons. Firstly, these algorithms often rely on assumptions about the generation of complementary labels, and it is not clear how far the assumptions are from reality. Secondly, their evaluation has been limited to synthetic datasets. To gain insights into the real-world performance of CLL algorithms, we developed a protocol to collect complementary labels from human annotators. Our efforts resulted in the creation of four datasets: CLCIFAR10, CLCIFAR20, CLMicroImageNet10, and CLMicroImageNet20, derived from well-known classification datasets CIFAR10, CIFAR100, and TinyImageNet200. These datasets represent the very first real-world CLL datasets. Through extensive benchmark experiments, we discovered a notable decrease in performance when transitioning from synthetic datasets to real-world datasets. We investigated the key factors contributing to the decrease with a thorough dataset-level ablation study. Our analyses highlight annotation noise as the most influential factor in the real-world datasets. In addition, we discover that the biased-nature of human-annotated complementary labels and the difficulty to validate with only complementary labels are two outstanding barriers to practical CLL. These findings suggest that the community focus more research efforts on developing CLL algorithms and validation schemes that are robust to noisy and biased complementary-label distributions.

6/26/2024


Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation

Jinkyung Park, Pamela Wisniewski, Vivek Singh

In this position paper, we discuss the potential for leveraging LLMs as interactive research tools to facilitate collaboration between human coders and AI to effectively annotate online risk data at scale. Collaborative human-AI labeling is a promising approach to annotating large-scale and complex data for various tasks. Yet, tools and methods to support effective human-AI collaboration for data annotation are under-studied. This gap is pertinent because co-labeling tasks need to support a two-way interactive discussion that can add nuance and context, particularly in the context of online risk, which is highly subjective and contextualized. Therefore, we provide some of the early benefits and challenges of using LLMs-based tools for risk annotation and suggest future directions for the HCI research community to leverage LLMs as research tools to facilitate human-AI collaboration in contextualized online data annotation. Our research interests align very well with the purposes of the LLMs as Research Tools workshop to identify ongoing applications and challenges of using LLMs to work with data in HCI research. We anticipate learning valuable insights from organizers and participants into how LLMs can help reshape the HCI community's methods for working with data.

4/12/2024