Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

2405.05714

Published 5/10/2024 by Rui Zhao, Bin Shi, Jianfei Ruan, Tianze Pan, Bo Dong

Abstract

In noisy label learning, estimating noisy class posteriors plays a fundamental role for developing consistent classifiers, as it forms the basis for estimating clean class posteriors and the transition matrix. Existing methods typically learn noisy class posteriors by training a classification model with noisy labels. However, when labels are incorrect, these models may be misled to overemphasize the feature parts that do not reflect the instance characteristics, resulting in significant errors in estimating noisy class posteriors. To address this issue, this paper proposes to augment the supervised information with part-level labels, encouraging the model to focus on and integrate richer information from various parts. Specifically, our method first partitions features into distinct parts by cropping instances, yielding part-level labels associated with these various parts. Subsequently, we introduce a novel single-to-multiple transition matrix to model the relationship between the noisy and part-level labels, which incorporates part-level labels into a classifier-consistent framework. Utilizing this framework with part-level labels, we can learn the noisy class posteriors more precisely by guiding the model to integrate information from various parts, ultimately improving the classification performance. Our method is theoretically sound, while experiments show that it is empirically effective in synthetic and real-world noisy benchmarks.

Overview

This paper proposes a new method for estimating noisy class posterior probabilities using part-level labels in noisy label learning scenarios.
The key idea is to crop each instance into parts and use the resulting part-level labels as extra supervision, so that the model integrates information from many parts instead of overemphasizing features tied to incorrect labels.
The authors report that the approach is effective on both synthetic and real-world noisy benchmarks.

Plain English Explanation

In machine learning, it's common to work with datasets where the labels (the information we're trying to predict) may contain errors or noise. This can happen for a variety of reasons, like mistakes in data collection or annotation. <a href="https://aimodels.fyi/papers/arxiv/pairwise-similarity-distribution-clustering-noisy-label-learning">Noisy label learning</a> is the field that studies how to train effective models even when the labels are imperfect.

The authors of this paper introduce an approach that adds supervision beyond the single noisy label attached to each example. They crop each example into several parts and obtain "part-level" labels associated with those parts. The motivation is that a model trained only on noisy labels can be misled into overemphasizing features that do not reflect what the example actually shows, and part-level labels push it to draw on richer information from across the whole example.

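The key insight is that by modeling the relationship between the noisy labels and the part-level labels, through what the authors call a single-to-multiple transition matrix, the model can estimate the noisy class posterior (the probability of each observed, possibly wrong, label) more precisely. That estimate is the starting point for recovering the clean class probabilities, so improving it allows the model to learn more effectively despite the imperfect training data.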

The authors demonstrate their approach on several benchmark datasets and show that it outperforms existing noisy label learning methods. This suggests that incorporating part-level information can be a powerful tool for dealing with noisy labels in real-world machine learning applications.

Technical Explanation

The core of the proposed method is a classifier-consistent framework that estimates the noisy class posterior using both the noisy instance-level label and part-level labels. Each instance is cropped into several parts, each part yields a part-level label, and a single-to-multiple transition matrix models the relationship between the single noisy label and the multiple part-level labels.
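As a rough illustration of the cropping step, the sketch below splits a toy image into a grid of non-overlapping parts. The grid layout, the `crop_parts` helper, and the list-of-lists image type are assumptions made for this example; the summary does not say how the paper places its crops or how a label is assigned to each part.

```python
from typing import List

Image = List[List[float]]  # hypothetical stand-in: a 2-D grid of pixel values


def crop_parts(image: Image, rows: int, cols: int) -> List[Image]:
    """Split an image into rows x cols non-overlapping crops (row-major order)."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    part_h, part_w = height // rows, width // cols
    parts = []
    for r in range(rows):
        for c in range(cols):
            # Take the rows of this grid cell, then slice out its columns.
            part = [row[c * part_w:(c + 1) * part_w]
                    for row in image[r * part_h:(r + 1) * part_h]]
            parts.append(part)
    return parts


# A 4x4 toy "image" whose pixel values are simply 0..15.
image = [[float(4 * i + j) for j in range(4)] for i in range(4)]
parts = crop_parts(image, rows=2, cols=2)
print(len(parts))  # 4
print(parts[0])    # [[0.0, 1.0], [4.0, 5.0]]
```

Each returned part would then be paired with its own part-level label, giving the model several supervised views of one instance instead of a single noisy label.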

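Mathematically, let $x$ be an instance, $y$ its noisy label, $z$ the part-level labels attached to its cropped parts, and $\pi$ the true class probabilities. The quantity the method targets is the noisy class posterior $p(y|x)$. The single-to-multiple transition matrix relates the single label $y$ to the multiple labels in $z$, which lets the part-level labels supervise the estimate of $p(y|x)$ within a classifier-consistent framework. A better estimate of $p(y|x)$ in turn supports better estimates of the transition matrix and of $\pi$.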
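For orientation, the standard class-conditional noise model that transition-matrix methods build on can be written as below. This is the general textbook relation under the assumption that the noise does not depend on the instance, not the paper's single-to-multiple matrix; the symbols $T$ and $C$ (number of classes) are notation introduced here.

$$
p(y = j \mid x) = \sum_{i=1}^{C} T_{ij}\, \pi_i(x), \qquad T_{ij} = p(y = j \mid \text{clean label} = i)
$$

In vector form this reads $p(y \mid x) = T^{\top} \pi(x)$. Recovering $\pi(x)$ therefore requires both $T$ and an accurate estimate of the noisy class posterior $p(y \mid x)$, which is why errors in that estimate matter.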

Training within this framework guides the model to integrate information from all parts of an instance instead of overemphasizing the few features that happen to agree with an incorrect label. According to the authors, the method is theoretically sound as well as empirically effective.
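To see why the accuracy of the noisy class posterior matters, the following minimal sketch applies the standard class-conditional relation between clean and noisy posteriors for two classes. It assumes a known, instance-independent transition matrix `T`; the function names and the numbers are illustrative and do not come from the paper.

```python
from typing import List


def noisy_posterior(clean: List[float], T: List[List[float]]) -> List[float]:
    """Forward step: p(noisy=j | x) = sum_i T[i][j] * clean[i]."""
    n = len(clean)
    return [sum(T[i][j] * clean[i] for i in range(n)) for j in range(n)]


def clean_posterior_binary(noisy: List[float], T: List[List[float]]) -> List[float]:
    """Backward step for two classes: solve noisy = T^T clean in closed form."""
    # noisy[0] = T[0][0] * c0 + T[1][0] * (1 - c0)  =>  solve for c0
    denom = T[0][0] - T[1][0]
    if abs(denom) < 1e-12:
        raise ValueError("transition matrix is not invertible")
    c0 = (noisy[0] - T[1][0]) / denom
    return [c0, 1.0 - c0]


# T[i][j] = P(noisy label = j | clean label = i); values are made up.
T = [[0.8, 0.2],
     [0.3, 0.7]]
clean = [0.9, 0.1]
noisy = noisy_posterior(clean, T)
recovered = clean_posterior_binary(noisy, T)
print([round(v, 4) for v in noisy])      # [0.75, 0.25]
print([round(v, 4) for v in recovered])  # [0.9, 0.1]
```

In this example an error of size e in the estimated noisy posterior turns into an error of e / (0.8 - 0.3) = 2e in the recovered clean posterior, so a poorly estimated noisy posterior is amplified rather than averaged out.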

Experiments on benchmark datasets like <a href="https://aimodels.fyi/papers/arxiv/extracting-clean-balanced-subset-noisy-long-tailed">Clothing1M</a> and Webvision show that this approach outperforms prior noisy label learning methods like <a href="https://aimodels.fyi/papers/arxiv/trusted-multi-view-learning-label-noise">Co-teaching</a> and <a href="https://aimodels.fyi/papers/arxiv/coordinated-sparse-recovery-label-noise">Crust</a>. The authors attribute this to the ability to better estimate the true class probabilities by leveraging the part-level labels.
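For the synthetic side of such evaluations, label noise is usually injected into a clean dataset by sampling from a chosen transition matrix. The sketch below uses symmetric noise, a common choice in the literature; the noise types and rates used in the paper are not stated in this summary, so the function names and every number here are assumptions.

```python
import random
from typing import List


def symmetric_transition_matrix(num_classes: int, noise_rate: float) -> List[List[float]]:
    """T[i][j] = P(noisy=j | clean=i): keep the label w.p. 1 - noise_rate, else flip uniformly."""
    off = noise_rate / (num_classes - 1)
    return [[1.0 - noise_rate if i == j else off for j in range(num_classes)]
            for i in range(num_classes)]


def corrupt_labels(labels: List[int], T: List[List[float]], seed: int = 0) -> List[int]:
    """Sample one noisy label per clean label from the matching row of T."""
    rng = random.Random(seed)
    classes = list(range(len(T)))
    return [rng.choices(classes, weights=T[y], k=1)[0] for y in labels]


T = symmetric_transition_matrix(num_classes=4, noise_rate=0.3)
clean_labels = [i % 4 for i in range(10_000)]
noisy_labels = corrupt_labels(clean_labels, T)
flipped = sum(c != n for c, n in zip(clean_labels, noisy_labels)) / len(clean_labels)
print(round(flipped, 2))  # empirical flip rate; should land near noise_rate
```

Because the true transition matrix is known in this setting, synthetic benchmarks make it possible to measure how far an estimated noisy class posterior or transition matrix is from the ground truth, which real-world datasets do not allow.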

Critical Analysis

The proposed method depends on part-level labels produced by cropping instances, so its benefit rests on how informative those crops are. A crop that misses the object of interest, or that covers only background, may carry a part-level label that does not reflect the instance. How sensitive the approach is to the number, size, and placement of the crops, and to the quality of the resulting part-level labels, is a natural question for practitioners.

Additionally, the model introduces several hyperparameters that need to be carefully tuned, such as the number of part-level labels and the tradeoff between part-level and overall label information. The authors provide limited guidance on how to set these hyperparameters in practice.

While the experimental results are promising, it would be valuable to see the method applied to a wider range of datasets and noisy label scenarios. The authors focus primarily on image classification tasks, but the approach may have different strengths and limitations when applied to other domains like text or speech.

Conclusion

This paper presents a novel approach for noisy label learning that leverages part-level labels to estimate the noisy class posterior more precisely. By modeling the relationship between noisy labels and part-level labels with a single-to-multiple transition matrix, the authors report improved classification performance on synthetic and real-world noisy benchmarks.

The key contribution is the insight that incorporating additional granular information, even if imperfect, can help overcome the challenges posed by noisy overall labels. This suggests that exploring hybrid approaches that combine coarse-grained and fine-grained supervision may be a fruitful direction for future research in noisy label learning and other machine learning scenarios with imperfect data.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning

Darshana Saravanan, Naresh Manwani, Vineet Gandhi

Partial label learning (PLL) is a weakly-supervised learning paradigm where each training instance is paired with a set of candidate labels (partial label), one of which is the true label. Noisy PLL (NPLL) relaxes this constraint by allowing some partial labels to not contain the true label, enhancing the practicality of the problem. Our work centres on NPLL and presents a minimalistic framework that initially assigns pseudo-labels to images by exploiting the noisy partial labels through a weighted nearest neighbour algorithm. These pseudo-label and image pairs are then used to train a deep neural network classifier with label smoothing. The classifier's features and predictions are subsequently employed to refine and enhance the accuracy of pseudo-labels. We perform thorough experiments on seven datasets and compare against nine NPLL and PLL methods. We achieve state-of-the-art results in all studied settings from the prior literature, obtaining substantial gains in fine-grained classification and extreme noise scenarios. Further, we show the promising generalisation capability of our framework in realistic crowd-sourced datasets.

5/29/2024

cs.CV cs.LG

High-dimensional Learning with Noisy Labels

Aymane El Firdoussi, Mohamed El Amine Seddik

This paper provides theoretical insights into high-dimensional binary classification with class-conditional noisy labels. Specifically, we study the behavior of a linear classifier with a label noisiness aware loss function, when both the dimension of data $p$ and the sample size $n$ are large and comparable. Relying on random matrix theory by supposing a Gaussian mixture data model, the performance of the linear classifier when $p, n \to \infty$ is shown to converge towards a limit, involving scalar statistics of the data. Importantly, our findings show that the low-dimensional intuitions to handle label noise do not hold in high-dimension, in the sense that the optimal classifier in low-dimension dramatically fails in high-dimension. Based on our derivations, we design an optimized method that is shown to be provably more efficient in handling noisy labels in high dimensions. Our theoretical conclusions are further confirmed by experiments on real datasets, where we show that our optimized approach outperforms the considered baselines.

5/24/2024

cs.LG cs.AI stat.ML

Extracting Clean and Balanced Subset for Noisy Long-tailed Classification

Zhuo Li, He Zhao, Zhen Li, Tongliang Liu, Dandan Guo, Xiang Wan

Real-world datasets usually are class-imbalanced and corrupted by label noise. To solve the joint issue of long-tailed distribution and label noise, most previous works usually aim to design a noise detector to distinguish the noisy and clean samples. Despite their effectiveness, they may be limited in handling the joint issue effectively in a unified way. In this work, we develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching, which can be solved with optimal transport (OT). By setting a manually-specific probability measure and using a learned transport plan to pseudo-label the training samples, the proposed method can reduce the side-effects of noisy and long-tailed data simultaneously. Then we introduce a simple yet effective filter criteria by combining the observed labels and pseudo labels to obtain a more balanced and less noisy subset for a robust model training. Extensive experiments demonstrate that our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.

4/11/2024

cs.LG

Pairwise Similarity Distribution Clustering for Noisy Label Learning

Sihan Bai

Noisy label learning aims to train deep neural networks using a large amount of samples with noisy labels, whose main challenge comes from how to deal with the inaccurate supervision caused by wrong labels. Existing works either take the label correction or sample selection paradigm to involve more samples with accurate labels into the training process. In this paper, we propose a simple yet effective sample selection algorithm, termed as Pairwise Similarity Distribution Clustering (PSDC), to divide the training samples into one clean set and another noisy set, which can power any of the off-the-shelf semi-supervised learning regimes to further train networks for different downstream tasks. Specifically, we take the pairwise similarity between sample pairs to represent the sample structure, and the Gaussian Mixture Model (GMM) to model the similarity distribution between sample pairs belonging to the same noisy cluster, therefore each sample can be confidently divided into the clean set or noisy set. Even under severe label noise rate, the resulting data partition mechanism has been proved to be more robust in judging the label confidence in both theory and practice. Experimental results on various benchmark datasets, such as CIFAR-10, CIFAR-100 and Clothing1M, demonstrate significant improvements over state-of-the-art methods.

4/3/2024

cs.LG cs.CV