Mitigating annotation shift in cancer classification using single image generative models

2405.19754

Published 5/31/2024 by Marta Buetas Arcas, Richard Osuala, Karim Lekadir, Oliver Díaz

Abstract

Artificial Intelligence (AI) has emerged as a valuable tool for assisting radiologists in breast cancer detection and diagnosis. However, the success of AI applications in this domain is restricted by the quantity and quality of available data, posing challenges due to limited and costly data annotation procedures that often lead to annotation shifts. This study simulates, analyses and mitigates annotation shifts in cancer classification in the breast mammography domain. First, a high-accuracy cancer risk prediction model is developed, which effectively distinguishes benign from malignant lesions. Next, model performance is used to quantify the impact of annotation shift. We uncover a substantial impact of annotation shift on multiclass classification performance particularly for malignant lesions. We thus propose a training data augmentation approach based on single-image generative models for the affected class, requiring as few as four in-domain annotations to considerably mitigate annotation shift, while also addressing dataset imbalance. Lastly, we further increase performance by proposing and validating an ensemble architecture based on multiple models trained under different data augmentation regimes. Our study offers key insights into annotation shift in deep learning breast cancer classification and explores the potential of single-image generative models to overcome domain shift challenges.

Overview

This paper explores the use of single image generative models to mitigate the issue of annotation shift in cancer classification tasks.
Annotation shift refers to the discrepancy between how images are annotated during model training and how they are annotated in real-world deployment scenarios.
The researchers propose augmenting the training data for the affected class with synthetic images from single-image generative models, which requires as few as four in-domain annotations and also helps with dataset imbalance.

Plain English Explanation

In medical imaging, annotating images with labels or descriptions is a crucial step for training AI models to detect and classify diseases like cancer. However, the way images are annotated during model development may not always match how they are annotated in actual clinical practice. This mismatch, known as "annotation shift," can cause the model to perform poorly when deployed in the real world.

To address this challenge, the researchers in this paper explored the use of single image generative models. These are AI systems that can create new, plausible-looking images based on the patterns they learn from a dataset. The idea is to use these generative models to produce synthetic images that better reflect the annotation patterns seen in real-world clinical settings. By training the cancer classification model on this more diverse and realistic set of images, the researchers hoped to improve its robustness and performance when deployed.

Technical Explanation

The key elements of this paper are:

Experiment Design: The researchers worked with breast mammography data and first developed a cancer risk prediction model that distinguishes benign from malignant lesions. They then simulated annotation shift by modifying the labels of a subset of the training images and used the drop in model performance to quantify its impact, which was largest for malignant lesions.
Single Image Generative Model: To mitigate the annotation shift, the researchers trained single-image generative models on images of the affected class, needing as few as four in-domain annotated images. These models were then used to generate additional, plausible-looking images for that class.
Model Training and Evaluation: The researchers trained the cancer classification model on a combination of the original training data and the synthetic images produced by the generative models. They evaluated the model's performance on a held-out test set that reflected the real-world annotation patterns. They also combined multiple models trained under different data augmentation regimes into an ensemble, which further increased performance.
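
The label-modification step in the experiment design can be illustrated with a small sketch. It assumes annotation shift is simulated by relabelling a fixed fraction of one class as another; the function name, the class names, and the 25% fraction are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def simulate_annotation_shift(labels, source_class, target_class, fraction, seed=0):
    """Return a copy of `labels` in which a fraction of the `source_class`
    items is relabelled as `target_class` (a hypothetical shift mechanism)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)  # seeded so the simulated shift is reproducible
    source_idx = [i for i, y in enumerate(labels) if y == source_class]
    n_flip = round(fraction * len(source_idx))
    flipped = set(rng.sample(source_idx, n_flip))
    return [target_class if i in flipped else y for i, y in enumerate(labels)]


# Toy label list; real labels would come from the dataset annotations.
clean = ["benign"] * 60 + ["malignant"] * 40
shifted = simulate_annotation_shift(clean, "malignant", "benign", fraction=0.25)
print(Counter(clean))
print(Counter(shifted))
```

Training the same classifier once on `clean` and once on `shifted`, then comparing per-class metrics on an untouched test set, is one way the impact of the shift could be quantified.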

The researchers found that incorporating the synthetic images generated by the single image generative model helped improve the cancer classification model's performance compared to training on the shifted dataset alone. This suggests that this approach can be effective in mitigating the negative impact of annotation shift in medical imaging tasks.
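
As a rough illustration of the augmentation idea, the sketch below adds synthetic samples only to the affected class until it matches the largest class. The balancing rule, the function names, and the `generate` callable are assumptions made for this example; `generate` merely stands in for sampling from a trained single-image generative model.

```python
from collections import Counter


def synthetic_needed(labels, affected_class):
    """Number of synthetic samples that would bring `affected_class`
    up to the size of the largest class."""
    counts = Counter(labels)
    return max(0, max(counts.values()) - counts[affected_class])


def augment_affected_class(samples, labels, affected_class, generate):
    """Append synthetic samples for one class only.

    `generate` is a placeholder callable (index -> sample) standing in for
    sampling from a trained single-image generative model.
    """
    n_new = synthetic_needed(labels, affected_class)
    synthetic = [generate(i) for i in range(n_new)]
    return samples + synthetic, labels + [affected_class] * n_new


# Toy data: strings stand in for image arrays.
labels = ["benign"] * 70 + ["malignant"] * 30
samples = [f"real_{i}" for i in range(len(labels))]
aug_samples, aug_labels = augment_affected_class(
    samples, labels, "malignant", generate=lambda i: f"synthetic_{i}"
)
print(Counter(aug_labels))
```

Restricting the synthetic samples to the affected class addresses the shifted labels and the class imbalance at the same time, which matches the motivation given in the abstract.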

Critical Analysis

The researchers acknowledge that their method relies on the assumption that the single image generative model can accurately capture the distribution of real-world annotations. If the generative model fails to produce sufficiently diverse and realistic synthetic images, the benefits of this approach may be limited.

Additionally, the paper does not explore the potential impact of different types or degrees of annotation shift on the effectiveness of this method. It would be valuable to investigate how the proposed approach performs in more complex or challenging annotation shift scenarios.

Further research could also examine the trade-offs between the quality and diversity of the synthetic images generated by the single image generative model and the resulting performance of the cancer classification model.

Conclusion

This paper presents a promising approach to mitigating the impact of annotation shift in cancer classification tasks using single image generative models. By generating synthetic images that better reflect real-world annotation patterns, the researchers were able to improve the robustness and performance of the cancer classification model.

While there are some limitations and areas for further exploration, this research highlights the potential of leveraging generative models to address challenges in medical imaging and AI-based diagnostic systems. As the field of medical AI continues to evolve, addressing issues like annotation shift will be crucial for developing reliable and deployable solutions.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

BRAIxDet: Learning to Detect Malignant Breast Lesion with Incomplete Annotations

Yuanhong Chen, Yuyuan Liu, Chong Wang, Michael Elliott, Chun Fung Kwok, Carlos Pena-Solorzano, Yu Tian, Fengbei Liu, Helen Frazer, Davis J. McCarthy, Gustavo Carneiro

Methods to detect malignant lesions from screening mammograms are usually trained with fully annotated datasets, where images are labelled with the localisation and classification of cancerous lesions. However, real-world screening mammogram datasets commonly have a subset that is fully annotated and another subset that is weakly annotated with just the global classification (i.e., without lesion localisation). Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it. The first option will reduce detection accuracy because it does not use the whole dataset, and the second option is too expensive given that the annotation needs to be done by expert radiologists. In this paper, we propose a middle-ground solution for the dilemma, which is to formulate the training as a weakly- and semi-supervised learning problem that we refer to as malignant breast lesion detection with incomplete annotations. To address this problem, our new method comprises two stages, namely: 1) pre-training a multi-view mammogram classifier with weak supervision from the whole dataset, and 2) extending the trained classifier to become a multi-view detector that is trained with semi-supervised student-teacher learning, where the training set contains fully and weakly-annotated mammograms. We provide extensive detection results on two real-world screening mammogram datasets containing incomplete annotations, and show that our proposed approach achieves state-of-the-art results in the detection of malignant breast lesions with incomplete annotations.

4/3/2024

cs.CV cs.AI cs.LG

RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse

Biomedical imaging datasets are often small and biased, meaning that real-world performance of predictive models can be substantially lower than expected from internal testing. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models; this can be used in advance of deployment to assess readiness, potentially reducing cost and patient harm. Existing editing methods can produce undesirable changes, with spurious correlations learned due to the co-occurrence of disease and treatment interventions, limiting practical applicability. To address this, we train a text-to-image diffusion model on multiple chest X-ray datasets and introduce a new editing method RadEdit that uses multiple masks, if present, to constrain changes and ensure consistency in the edited images. We consider three types of dataset shifts: acquisition shift, manifestation shift, and population shift, and demonstrate that our approach can diagnose failures and quantify model robustness without additional data collection, complementing more qualitative tools for explainable AI.

4/4/2024

cs.CV cs.AI

Annotating Ambiguous Images: General Annotation Strategy for High-Quality Data with Real-World Biomedical Validation

Lars Schmarje, Vasco Grossmann, Claudius Zelenka, Johannes Brunger, Reinhard Koch

In the field of image classification, existing methods often struggle with biased or ambiguous data, a prevalent issue in real-world scenarios. Current strategies, including semi-supervised learning and class blending, offer partial solutions but lack a definitive resolution. Addressing this gap, our paper introduces a novel strategy for generating high-quality labels in challenging datasets. Central to our approach is a clearly designed flowchart, based on a broad literature review, which enables the creation of reliable labels. We validate our methodology through a rigorous real-world test case in the biomedical field, specifically in deducing height reduction from vertebral imaging. Our empirical study, leveraging over 250,000 annotations, demonstrates the effectiveness of our strategy's decisions compared to their alternatives.

4/30/2024

cs.CV

Enhancing AI Diagnostics: Autonomous Lesion Masking via Semi-Supervised Deep Learning

Ting-Ruen Wei, Michele Hell, Dang Bich Thuy Le, Aren Vierra, Ran Pang, Mahesh Patel, Young Kang, Yuling Yan

This study presents an unsupervised domain adaptation method aimed at autonomously generating image masks outlining regions of interest (ROIs) for differentiating breast lesions in breast ultrasound (US) imaging. Our semi-supervised learning approach utilizes a primitive model trained on a small public breast US dataset with true annotations. This model is then iteratively refined for the domain adaptation task, generating pseudo-masks for our private, unannotated breast US dataset. The dataset, twice the size of the public one, exhibits considerable variability in image acquisition perspectives and demographic representation, posing a domain-shift challenge. Unlike typical domain adversarial training, we employ downstream classification outcomes as a benchmark to guide the updating of pseudo-masks in subsequent iterations. We found the classification precision to be highly correlated with the completeness of the generated ROIs, which promotes the explainability of the deep learning classification model. Preliminary findings demonstrate the efficacy and reliability of this approach in streamlining the ROI annotation process, thereby enhancing the classification and localization of breast lesions for more precise and interpretable diagnoses.

4/22/2024

cs.CV cs.AI cs.LG