Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification

2404.07605

Published 4/12/2024 by Lucas Dedieu, Nicolas Nerrienet, Adrien Nivaggioli, Clara Simmat, Marceau Clavel, Arnaud Gauthier, Stéphane Sockeel, Rémy Peyret

cs.CV cs.AI

Abstract

Recent advancements in deep learning have proven highly effective in medical image classification, notably within histopathology. However, noisy labels represent a critical challenge in histopathology image classification, where accurate annotations are vital for training robust deep learning models. Indeed, deep neural networks can easily overfit label noise, leading to severe degradations in model performance. While numerous public pathology foundation models have emerged recently, none have evaluated their resilience to label noise. Through thorough empirical analyses across multiple datasets, we exhibit the label noise resilience property of embeddings extracted from foundation models trained in a self-supervised contrastive manner. We demonstrate that training with such embeddings substantially enhances label noise robustness when compared to non-contrastive-based ones as well as commonly used noise-resilient methods. Our results unequivocally underline the superiority of contrastive learning in effectively mitigating the label noise challenge. Code is publicly available at https://github.com/LucasDedieu/NoiseResilientHistopathology.

Overview

This paper shows that deep embeddings from contrastively trained models make histopathology image classification resilient to label noise.
The approach, referred to here as Contrastive-Based Deep Embeddings (CBDE), trains classifiers on embeddings extracted from foundation models that were pretrained with self-supervised contrastive learning, so the image representations are less affected by noisy labels.
The authors evaluate CBDE on several histopathology datasets with varying levels of label noise and compare it to embeddings from non-contrastive models and to commonly used noise-resilient methods.

Plain English Explanation

In medical imaging, sometimes the labels (e.g., disease diagnosis) associated with the images can be inaccurate or unreliable. This is known as "label noise," and it can severely degrade the performance of machine learning models trained on this data.

The researchers in this paper developed a new deep learning approach called Contrastive-Based Deep Embeddings (CBDE) that is more resilient to label noise. CBDE works by learning robust image representations through a process called contrastive learning.

Contrastive learning, in the self-supervised form used here, encourages the model to produce similar features for different augmented views of the same image and dissimilar features for different images. Since this stage never uses the provided labels, noisy annotations cannot distort the learned features, and the model focuses on the underlying visual patterns rather than memorizing potentially unreliable labels.
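To make the idea concrete, here is a minimal NumPy sketch of an InfoNCE-style contrastive loss. It is an illustration only: the function name `info_nce_loss`, the temperature value, and the random toy embeddings are assumptions, and the foundation models discussed in the paper may use a different contrastive objective. Note that the loss takes pairs of views and no class labels at all.

```python
import numpy as np

def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over two batches of embeddings.

    view_a[i] and view_b[i] are embeddings of two augmented views of
    image i. Class labels are never used (hypothetical helper).
    """
    # L2-normalize so that dot products are cosine similarities.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = a @ b.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for row i sits on the diagonal (column i).
    return float(-np.mean(np.diag(log_probs)))

# Toy check: matching views should give a lower loss than unrelated ones.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))
loss_matching = info_nce_loss(base, base + 0.01 * rng.normal(size=base.shape))
loss_unrelated = info_nce_loss(base, rng.normal(size=base.shape))
print(loss_matching < loss_unrelated)
```

The loss is small when each embedding is closest to its own second view and large when the pairing carries no information, which is the behavior the paragraph above describes.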

The authors show that CBDE outperforms other state-of-the-art methods for learning with noisy labels on several histopathology image classification tasks, demonstrating its effectiveness in real-world medical imaging applications where label noise is common.

Technical Explanation

The key technical innovation of this paper is the Contrastive-Based Deep Embeddings (CBDE) approach, which consists of two main components:

Contrastive Learning Module: This module learns robust image representations in a self-supervised manner, maximizing the similarity between embeddings of augmented views of the same image and minimizing the similarity between embeddings of different images. Because no labels are used at this stage, the representations themselves are unaffected by label noise.
Noise-Aware Classification Module: This module takes the learned embeddings from the contrastive learning module and uses them to perform the final image classification task, while also accounting for the potential label noise.
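The second stage can be sketched as a simple linear classifier trained on fixed embeddings with noisy labels. This is a toy illustration, not the authors' implementation: the helper names (`train_linear_probe`, `predict`, `make_synthetic_embeddings`), the hyperparameters, and the two Gaussian clusters standing in for foundation-model embeddings are all assumptions made for the example.

```python
import numpy as np

def train_linear_probe(embeddings, labels, num_classes, lr=0.1, epochs=200):
    """Fit a softmax linear classifier on fixed embeddings by gradient descent."""
    n, d = embeddings.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = embeddings @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / n  # gradient of mean cross-entropy
        weights -= lr * embeddings.T @ grad
        bias -= lr * grad.sum(axis=0)
    return weights, bias

def predict(embeddings, weights, bias):
    return np.argmax(embeddings @ weights + bias, axis=1)

def make_synthetic_embeddings(rng, n, dim=8):
    """Hypothetical stand-in for frozen embeddings: two Gaussian clusters."""
    labels = rng.integers(0, 2, size=n)
    centers = np.array([[-2.0] * dim, [2.0] * dim])
    return centers[labels] + rng.normal(size=(n, dim)), labels

rng = np.random.default_rng(0)
train_x, train_clean = make_synthetic_embeddings(rng, 400)
test_x, test_clean = make_synthetic_embeddings(rng, 200)

# Corrupt roughly 30% of the training labels (binary flip).
flip = rng.random(len(train_clean)) < 0.3
train_noisy = np.where(flip, 1 - train_clean, train_clean)

# Train on noisy labels, evaluate against clean held-out labels.
weights, bias = train_linear_probe(train_x, train_noisy, num_classes=2)
clean_accuracy = float(np.mean(predict(test_x, weights, bias) == test_clean))
print(f"accuracy on clean test labels: {clean_accuracy:.3f}")
```

The point of the sketch is the division of labor: when the embeddings already separate the classes well, a low-capacity classifier has little room to memorize individual mislabeled samples, so the decision boundary is driven by the majority of correct labels.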

The authors evaluate CBDE on several histopathology image classification datasets, including Camelyon16, PatchCamelyon, and an in-house dataset. They introduce various levels of synthetic label noise to these datasets and compare the performance of CBDE to other state-of-the-art methods for learning with noisy labels.
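Synthetic label noise of this kind is typically injected by flipping a controlled fraction of the training labels. The sketch below assumes symmetric (uniform) noise, since this summary does not state which noise model the authors used; the function name `add_symmetric_noise` and its parameters are hypothetical.

```python
import numpy as np

def add_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Flip a fraction `noise_rate` of labels to a different, uniformly chosen class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()

    # Pick exactly round(noise_rate * n) samples to corrupt.
    n_flip = int(round(noise_rate * len(labels)))
    flip_idx = rng.choice(len(labels), size=n_flip, replace=False)

    # An offset in [1, num_classes - 1] taken modulo num_classes
    # always lands on a class different from the original one.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[flip_idx] = (labels[flip_idx] + offsets) % num_classes
    return noisy

clean = np.random.default_rng(1).integers(0, 3, size=1000)
noisy = add_symmetric_noise(clean, noise_rate=0.4, num_classes=3)
print((noisy != clean).mean())
```

Keeping the clean labels alongside the noisy ones allows training on the corrupted set while still measuring accuracy against the uncorrupted ground truth.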

The results show that CBDE significantly outperforms the competing methods, demonstrating its effectiveness in learning robust image representations that are resilient to label noise in histopathology image classification tasks.

Critical Analysis

One potential limitation of this work is that the authors only consider synthetic label noise, which may not fully capture the complexities of real-world label noise encountered in medical imaging. It would be valuable to evaluate CBDE on more diverse, real-world histopathology datasets with naturally occurring label noise.

Additionally, the authors do not provide a detailed analysis of the types of errors or misclassifications made by CBDE compared to the other methods. A more in-depth error analysis could yield additional insights into the strengths and weaknesses of the proposed approach.

That said, the authors do acknowledge these limitations and suggest future research directions, such as exploring the use of meta-learning or Bayesian techniques to further improve the robustness of CBDE to different types of label noise.

Conclusion

This paper presents a novel contrastive-based deep learning approach, called Contrastive-Based Deep Embeddings (CBDE), that is designed to be resilient to label noise in histopathology image classification tasks. The key innovation is the use of contrastive learning to learn robust image representations that are less affected by inaccurate or unreliable labels.

The authors demonstrate the effectiveness of CBDE through extensive experiments on several histopathology datasets, showing that it outperforms other state-of-the-art methods for learning with noisy labels. This work has important implications for improving the reliability and robustness of medical image analysis systems, which are crucial for clinical decision-making and patient care.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Noisy Label Processing for Classification: A Survey

Mengting Li, Chuang Zhu

In recent years, deep neural networks (DNNs) have gained remarkable achievement in computer vision tasks, and the success of DNNs often depends greatly on the richness of data. However, the acquisition process of data and high-quality ground truth requires a lot of manpower and money. In the long, tedious process of data annotation, annotators are prone to make mistakes, resulting in incorrect labels of images, i.e., noisy labels. The emergence of noisy labels is inevitable. Moreover, since research shows that DNNs can easily fit noisy labels, the existence of noisy labels will cause significant damage to the model training process. Therefore, it is crucial to combat noisy labels for computer vision tasks, especially for classification tasks. In this survey, we first comprehensively review the evolution of different deep learning approaches for noisy label combating in the image classification task. In addition, we also review different noise patterns that have been proposed to design robust algorithms. Furthermore, we explore the inner pattern of real-world label noise and propose an algorithm to generate a synthetic label noise pattern guided by real-world data. We test the algorithm on the well-known real-world dataset CIFAR-10N to form a new real-world data-guided synthetic benchmark and evaluate some typical noise-robust methods on the benchmark.

4/8/2024

cs.CV cs.AI

Classification of Breast Cancer Histopathology Images using a Modified Supervised Contrastive Learning Method

Matina Mahdizadeh Sani, Ali Royat, Mahdieh Soleymani Baghshah

Deep neural networks have reached remarkable achievements in medical image processing tasks, specifically classifying and detecting various diseases. However, when confronted with limited data, these networks face a critical vulnerability, often succumbing to overfitting by excessively memorizing the limited information available. This work addresses the challenge mentioned above by improving the supervised contrastive learning method to reduce the impact of false positives. Unlike most existing methods that rely predominantly on fully supervised learning, our approach leverages the advantages of self-supervised learning in conjunction with employing the available labeled data. We evaluate our method on the BreakHis dataset, which consists of breast cancer histopathology images, and demonstrate an increase in classification accuracy by 1.45% at the image level and 1.42% at the patient level compared to the state-of-the-art method. This improvement corresponds to 93.63% absolute accuracy, highlighting our approach's effectiveness in leveraging data properties to learn more appropriate representation space.

5/7/2024

cs.CV cs.LG

Self-Contrastive Weakly Supervised Learning Framework for Prognostic Prediction Using Whole Slide Images

Saul Fuster, Farbod Khoraminia, Julio Silva-Rodríguez, Umay Kiraz, Geert J. L. H. van Leenders, Trygve Eftestøl, Valery Naranjo, Emiel A. M. Janssen, Tahlita C. M. Zuiverloon, Kjersti Engan

We present a pioneering investigation into the application of deep learning techniques to analyze histopathological images for addressing the substantial challenge of automated prognostic prediction. Prognostic prediction poses a unique challenge as the ground truth labels are inherently weak, and the model must anticipate future events that are not directly observable in the image. To address this challenge, we propose a novel three-part framework comprising of a convolutional network based tissue segmentation algorithm for region of interest delineation, a contrastive learning module for feature extraction, and a nested multiple instance learning classification module. Our study explores the significance of various regions of interest within the histopathological slides and exploits diverse learning scenarios. The pipeline is initially validated on artificially generated data and a simpler diagnostic task. Transitioning to prognostic prediction, tasks become more challenging. Employing bladder cancer as use case, our best models yield an AUC of 0.721 and 0.678 for recurrence and treatment outcome prediction respectively.

5/27/2024

cs.CV cs.AI

Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification

Yanbiao Ma, Licheng Jiao, Fang Liu, Lingling Li, Shuyuan Yang, Xu Liu

In semi-supervised learning, methods that rely on confidence learning to generate pseudo-labels have been widely proposed. However, increasing research finds that when faced with noisy and biased data, the model's representation network is more reliable than the classification network. Additionally, label generation methods based on model predictions often show poor adaptability across different datasets, necessitating customization of the classification network. Therefore, we propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels. We also introduce an adaptive method for selecting hyperparameters in HDL, enhancing its versatility. Moreover, HDL can be combined with general image encoders (e.g., CLIP) to serve as a fundamental data processing module. We extract embeddings from datasets with class-balanced and long-tailed distributions using pre-trained semi-supervised models. Subsequently, samples are re-labeled using HDL, and the re-labeled samples are used to further train the semi-supervised models. Experiments demonstrate improved model performance, validating the motivation that representation networks are more reliable than classifiers or predictors. Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.

4/29/2024

cs.CV cs.AI