PASS: Peer-Agreement based Sample Selection for training with Noisy Labels

arXiv:2303.10802

Published 5/1/2024 by Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro


Abstract

The prevalence of noisy-label samples poses a significant challenge in deep learning, inducing overfitting effects. This has, therefore, motivated the emergence of learning with noisy-label (LNL) techniques that focus on separating noisy- and clean-label samples to apply different learning strategies to each group of samples. Current methodologies often rely on the small-loss hypothesis or feature-based selection to separate noisy- and clean-label samples, yet our empirical observations reveal their limitations, especially for labels with instance dependent noise (IDN). An important characteristic of IDN is the difficulty to distinguish the clean-label samples that lie near the decision boundary (i.e., the hard samples) from the noisy-label samples. We, therefore, propose a new noisy-label detection method, termed Peer-Agreement based Sample Selection (PASS), to address this problem. Utilising a trio of classifiers, PASS employs consensus-driven peer-based agreement of two models to select the samples to train the remaining model. PASS is easily integrated into existing LNL models, enabling the improvement of the detection accuracy of noisy- and clean-label samples, which increases the classification accuracy across various LNL benchmarks.


Overview

Deep learning models can struggle with noisy or unreliable training data, leading to overfitting.
Techniques for learning with noisy labels (LNL) aim to separate noisy and clean samples to apply different training strategies.
Existing methods often rely on the small-loss hypothesis or feature-based selection, but have limitations with instance-dependent noise (IDN).
IDN makes it difficult to distinguish clean samples near the decision boundary (hard samples) from noisy samples.
This paper proposes a new method called Peer-Agreement based Sample Selection (PASS) to better detect noisy and clean samples, especially for IDN.

Plain English Explanation

Deep learning models can perform very well, but they require large amounts of high-quality training data. Unfortunately, in the real world, the data we have is often messy and unreliable, with some samples having incorrect or "noisy" labels. This can cause the model to overfit to the noisy data, reducing its overall performance.

To address this challenge, researchers have developed learning with noisy labels (LNL) techniques. The goal of LNL is to identify which samples have noisy labels, and then apply different training strategies to the noisy and clean samples.

Current LNL methods often rely on the "small-loss hypothesis" or feature-based selection to try to separate the noisy and clean samples. However, the authors of this paper found that these approaches have limitations, especially when dealing with a type of noise called "instance-dependent noise" (IDN).
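To make the small-loss hypothesis concrete, here is a minimal sketch of small-loss selection in plain Python. The function names, the `keep_ratio` parameter, and the toy predictions are assumptions for illustration and are not taken from the paper. The idea is to compute each sample's loss against its given label and treat the lowest-loss fraction as clean.

```python
import math

def cross_entropy(probs, label):
    # Loss of one sample: negative log-probability assigned to its given label.
    return -math.log(max(probs[label], 1e-12))

def small_loss_select(pred_probs, labels, keep_ratio=0.5):
    """Return indices of the samples treated as clean (lowest loss first).

    keep_ratio is a hypothetical hyperparameter: the fraction of samples kept.
    """
    losses = [cross_entropy(p, y) for p, y in zip(pred_probs, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    n_keep = int(keep_ratio * len(losses))
    return sorted(order[:n_keep])

# Toy model outputs for four samples and their (possibly noisy) labels.
pred_probs = [[0.9, 0.1], [0.2, 0.8], [0.55, 0.45], [0.3, 0.7]]
labels = [0, 0, 1, 1]
selected = small_loss_select(pred_probs, labels, keep_ratio=0.5)
```

In this toy data, sample 2 sits close to the decision boundary, so its loss is fairly high and it is discarded even if its label is correct. That is the kind of hard, clean sample the summary says loss-based selection can mistake for a noisy one.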

With IDN, it can be very difficult to distinguish clean samples that are close to the decision boundary (the "hard" samples) from samples with noisy labels. This is a problem because those hard, clean samples are actually very important for the model to learn from.

To address this issue, the authors propose a new method called "Peer-Agreement based Sample Selection" (PASS). PASS uses a trio of classifiers: the agreement between two of them (the peers) decides which samples are treated as clean and which as noisy when training the third. This is intended to help LNL models identify and handle noisy data more reliably, including the hard samples near the decision boundary, and so improve overall classification accuracy.

Technical Explanation

The key innovation in this paper is the PASS method for detecting noisy and clean labels, especially in the presence of instance-dependent noise (IDN). IDN is a challenging type of noise in which the probability that a label is wrong depends on the individual sample itself rather than only on its class.
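One common way to write this distinction down is sketched below. The notation is an assumption for illustration and is not taken from the paper: x is a sample, y its true label, and ỹ the observed (possibly noisy) label.

```latex
% Class-dependent (instance-independent) noise: the flip probability ignores x
P(\tilde{y} = j \mid y = i, x) = P(\tilde{y} = j \mid y = i) = T_{ij}

% Instance-dependent noise (IDN): the flip probability varies with x
P(\tilde{y} = j \mid y = i, x) = T_{ij}(x)
```

Under the first form a single transition matrix T describes the noise for every sample. Under IDN each sample has its own flip probabilities, which is why ambiguous samples near the decision boundary are both more likely to be mislabelled and harder to tell apart from clean ones.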

The PASS method works as follows:

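Train three classifier models on the noisy training data.
For each model in turn, treat the other two as its peers and compare the peers' predictions on every sample. Samples on which the two peers agree are treated as clean-label samples; samples on which they disagree are treated as noisy-label samples.
Train the remaining model on that split, using whatever strategies the host LNL method applies to clean and noisy samples, and rotate so that each model is trained on samples selected by its two peers.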
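The selection step can be sketched as follows, going by the abstract's description that the agreement of two peer models selects the samples used to train the remaining model. This is an illustrative sketch, not the authors' implementation: measuring agreement by cosine similarity between the peers' predicted class probabilities, the fixed `threshold` value, and all function names are assumptions made here.

```python
import math

def cosine(p, q):
    # Cosine similarity between two probability vectors.
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0

def peer_agreement_split(probs, threshold=0.9):
    """For each of three models, split sample indices using its two peers.

    probs maps a model name to a list of per-sample class-probability vectors.
    threshold is a hypothetical fixed cut-off on the agreement score.
    Returns {target_model: (clean_indices, noisy_indices)}.
    """
    names = list(probs)
    assert len(names) == 3, "the sketch assumes exactly three models"
    splits = {}
    for target in names:
        # The two models other than the target act as its peers.
        peer_a, peer_b = [n for n in names if n != target]
        clean, noisy = [], []
        for i, (p, q) in enumerate(zip(probs[peer_a], probs[peer_b])):
            (clean if cosine(p, q) >= threshold else noisy).append(i)
        splits[target] = (clean, noisy)
    return splits

# Toy predictions from three models on three two-class samples.
probs = {
    "A": [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]],
    "B": [[0.8, 0.2], [0.1, 0.9], [0.3, 0.7]],
    "C": [[0.95, 0.05], [0.5, 0.5], [0.9, 0.1]],
}
splits = peer_agreement_split(probs)
```

Each model receives its own clean/noisy split, chosen by the other two, so no model selects its own training samples. In a full pipeline the host LNL method would then apply its own strategies to the two groups (for example, supervised training on the clean group).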

This peer-based consensus approach is effective at identifying hard, clean samples that lie near the decision boundary, which are difficult for other LNL methods to detect correctly. By accurately separating the noisy and clean samples, PASS enables LNL models to apply specialized training strategies for each group, leading to improved overall performance.

The paper evaluates PASS on several standard LNL benchmarks and reports that integrating it into existing LNL models improves both the detection of noisy- and clean-label samples and the resulting classification accuracy, especially in scenarios with IDN. The authors also provide ablation studies and analyses to demonstrate the key factors contributing to PASS's effectiveness.

Critical Analysis

One potential limitation of the PASS method is that it requires training three separate classifier models, which could be computationally expensive compared to single-model approaches. The authors do note that PASS can be easily integrated into existing LNL models, but the additional training burden may be a practical concern in some applications.

Additionally, the paper focuses on evaluating PASS on standard LNL benchmarks, but does not explore its performance on real-world, large-scale datasets with more complex noise patterns. Further research may be needed to understand how well PASS scales and generalizes to more diverse and challenging noisy label scenarios.

While the authors demonstrate the advantages of PASS over other noisy label detection methods, they do not provide a deep, theoretical analysis of why the peer-based consensus approach is effective. A more rigorous mathematical understanding of the method's strengths and weaknesses could lead to further improvements or inspire the development of alternative techniques.

Overall, the PASS method presents a promising approach for dealing with instance-dependent noisy labels, but additional research may be needed to fully understand its limitations and potential for real-world applications. Readers are encouraged to think critically about the tradeoffs and consider how this work fits into the broader landscape of noisy label learning techniques.

Conclusion

This paper introduces a noisy-label detection method called PASS, which uses three classifiers and lets the agreement between two of them select the training samples for the third, in order to separate noisy and clean samples more reliably, especially in the presence of instance-dependent noise. By more accurately identifying hard, clean samples near the decision boundary, PASS enables learning with noisy labels (LNL) techniques to apply specialized training strategies and improve overall classification performance.

The PASS method represents a valuable contribution to the field of noisy label learning, addressing a key limitation of existing approaches. As deep learning continues to be applied to real-world, messy datasets, techniques like PASS will become increasingly important for building robust and reliable models. Further research may explore how PASS scales to larger datasets and more complex noise patterns, as well as develop a deeper theoretical understanding of its strengths and weaknesses.

