BTMuda: A Bi-level Multi-source unsupervised domain adaptation framework for breast cancer diagnosis

Read original: arXiv:2408.17054 - Published 9/2/2024 by Yuxiang Yang, Xinyi Zeng, Pinxian Zeng, Binyu Yan, Xi Wu, Jiliu Zhou, Yan Wang

Overview

Deep learning has transformed early detection of breast cancer, leading to reduced mortality rates.
However, difficulties in obtaining annotations and differences between training and real-world data have limited its clinical use.
Unsupervised domain adaptation (UDA) methods can transfer knowledge from a single labeled source domain to an unlabeled target domain, but they suffer from severe domain shift and often ignore the benefits of using multiple relevant sources.
This paper proposes BTMuda, a Bi-level Multi-source UDA method for breast cancer diagnosis, to address these limitations.

Plain English Explanation

Deep learning has been a game-changer in the early detection of breast cancer, leading to a significant drop in mortality rates. However, there are a few challenges that have prevented these techniques from being widely adopted in clinical settings.

One key issue is the difficulty in obtaining the large amounts of labeled data that deep learning models require to perform well. Another problem is that the training data may not fully reflect the real-world conditions the model will encounter in a clinical setting, leading to a "domain shift" problem where the model's performance degrades.

To address these limitations, researchers have turned to a technique called unsupervised domain adaptation (UDA). The idea behind UDA is to take a model trained on a well-labeled dataset (the "source" domain) and adapt it to work well on a different, unlabeled dataset (the "target" domain) without requiring new annotations. This helps bridge the gap between the training data and the real-world data.
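As a toy illustration of domain shift and label-free adaptation, the sketch below trains a trivial classifier on a labeled source set, applies it to a shifted target set, and then re-centers the target data without looking at its labels. This is not the paper's method: the one-dimensional data, the midpoint-threshold classifier, and the mean-matching step are all assumptions chosen to keep the example short.

```python
import random

def make_domain(n, shift, seed):
    """Draw n one-dimensional samples per class; `shift` moves the whole domain."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n):
            xs.append(rng.gauss(center + shift, 0.5))
            ys.append(label)
    return xs, ys

def fit_threshold(xs, ys):
    """Source-only 'model': the midpoint between the two class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2.0

def accuracy(xs, ys, threshold):
    """Fraction of samples whose side of the threshold matches the label."""
    hits = sum(int(x > threshold) == y for x, y in zip(xs, ys))
    return hits / len(ys)

# Labeled source domain and a shifted target domain.
# Target labels are used only for scoring, never for adaptation.
source_x, source_y = make_domain(500, shift=0.0, seed=0)
target_x, target_y = make_domain(500, shift=2.0, seed=1)

threshold = fit_threshold(source_x, source_y)
acc_source = accuracy(source_x, source_y, threshold)
acc_target_raw = accuracy(target_x, target_y, threshold)

# Label-free adaptation (assumed, simplistic): match the target mean to the source mean.
offset = sum(source_x) / len(source_x) - sum(target_x) / len(target_x)
adapted_x = [x + offset for x in target_x]
acc_target_adapted = accuracy(adapted_x, target_y, threshold)

print(f"source: {acc_source:.2f}  target: {acc_target_raw:.2f}  adapted: {acc_target_adapted:.2f}")
```

The source-trained threshold does well on the source data, drops to roughly chance on the shifted target, and recovers once the two distributions are aligned. Real mammography data will not differ by a simple offset, which is why methods such as BTMuda learn the alignment in feature space instead.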

However, existing UDA methods still struggle with the domain shift problem and often fail to take full advantage of having multiple relevant source domains available. To overcome these issues, the researchers developed a new method called BTMuda, which stands for "Bi-level Multi-source unsupervised domain adaptation."

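BTMuda splits the domain shift problem into two levels, intra-domain and inter-domain, and handles each one separately. Its key innovations are: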

A "Three-Branch Mixed" feature extractor in which a convolutional neural network (CNN) and a transformer are trained jointly as two paths, so the features carry both low-level local and high-level global information. This reduces the intra-domain shift.
A transformer redesigned into a three-branch architecture with cross-attention and distillation, which learns domain-invariant representations from multiple source domains. This addresses the inter-domain shift.
Two alignment modules, one for feature alignment and one for classifier alignment, to further align the source and target domains.
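To make the two alignment ideas in the list above more concrete, here is a minimal sketch of what a feature alignment loss and a classifier alignment loss could look like. The summary does not specify the losses BTMuda actually uses, so both functions are assumptions: the first is a linear maximum mean discrepancy (the squared distance between the mean source and mean target features), and the second is the mean absolute disagreement between two source-specific classifiers on the same target batch.

```python
def mean_vector(features):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]

def feature_alignment_loss(source_feats, target_feats):
    """Linear MMD (assumed): squared Euclidean distance between domain means."""
    mu_s = mean_vector(source_feats)
    mu_t = mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def classifier_alignment_loss(probs_a, probs_b):
    """Mean absolute disagreement (assumed) between two classifiers' class
    probabilities, computed on the same batch of target samples."""
    total = 0.0
    for pa, pb in zip(probs_a, probs_b):
        total += sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)
    return total / len(probs_a)

# Hypothetical two-dimensional features from one source and the target domain.
source_feats = [[0.0, 1.0], [2.0, 3.0]]
target_feats = [[1.0, 2.0], [3.0, 4.0]]
feat_loss = feature_alignment_loss(source_feats, target_feats)

# Hypothetical predictions of two source-specific classifiers on two target images.
probs_a = [[0.9, 0.1], [0.2, 0.8]]
probs_b = [[0.7, 0.3], [0.2, 0.8]]
clf_loss = classifier_alignment_loss(probs_a, probs_b)
```

Both quantities are zero when the domains (or the classifiers) agree perfectly, so minimizing them during training pushes the model toward features and decisions that do not depend on which dataset an image came from.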

Through extensive experiments on three public mammographic datasets, the researchers demonstrate that BTMuda outperforms other state-of-the-art UDA methods for breast cancer diagnosis. This work represents an important step forward in making deep learning-based breast cancer detection more practical and accessible for real-world clinical applications.

Technical Explanation

The paper proposes a Bi-level Multi-source unsupervised domain adaptation (BTMuda) method for breast cancer diagnosis. The key components of BTMuda are:

Three-Branch Mixed Feature Extractor: To address the intra-domain shift problem, the method jointly trains a CNN and a transformer as two paths of a domain mixed feature extractor. This allows the model to capture both low-level local and high-level global information, resulting in more robust representations.
Redesigned Transformer Architecture: To handle the inter-domain shift, the researchers redesign the transformer into a three-branch architecture with cross-attention and distillation mechanisms. This enables the model to learn domain-invariant representations by leveraging multiple source domains.
Alignment Modules: BTMuda introduces two alignment modules - one for feature alignment and one for classifier alignment - to further improve the adaptation process between the source and target domains.
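The cross-attention mechanism mentioned above can be sketched in a few lines. The version below is plain scaled dot-product attention in which the queries come from one domain and the keys and values from another, so each output token is a weighted mix of the other domain's tokens. It is an illustration only: it has no learned projection matrices, no multiple heads, and no distillation, and the choice of target tokens as queries and source tokens as keys and values is an assumption, not a detail taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one domain and
    keys/values from another. Returns one mixed vector per query."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value tokens.
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Hypothetical token embeddings: two target tokens attend over three source tokens.
target_tokens = [[1.0, 0.0], [0.0, 1.0]]
source_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = cross_attention(target_tokens, source_tokens, source_tokens)
```

Each target token ends up dominated by the source tokens it resembles most, which is the intuition behind using cross-attention to pull representations from different domains toward a shared space.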

The researchers evaluate BTMuda on three public mammographic datasets and demonstrate that it outperforms state-of-the-art UDA methods for breast cancer diagnosis. The paper highlights the importance of addressing both intra-domain and inter-domain shift issues, as well as the benefits of incorporating multiple relevant source domains, to improve the clinical applicability of deep learning-based breast cancer detection.

Critical Analysis

The paper presents a well-designed and comprehensive approach to addressing the limitations of existing UDA methods for breast cancer diagnosis. Its key strengths are the focus on both intra-domain and inter-domain shift and the combination of a mixed feature extractor with a redesigned transformer architecture.

However, the paper does not provide a thorough discussion of the potential limitations or caveats of the proposed BTMuda method. For example, it would be valuable to understand the computational complexity of the model, the sensitivity to hyperparameter tuning, and the generalizability of the approach to other medical imaging domains beyond breast cancer.

Additionally, the paper could have benefited from a more critical analysis of the experimental results, such as a deeper exploration of failure cases or a comparison to human expert performance on the same tasks. This would help readers better understand the practical implications and remaining challenges in deploying deep learning-based breast cancer detection in real-world clinical settings.

Overall, this paper represents an important contribution to the field of unsupervised domain adaptation for medical imaging, and the proposed BTMuda method shows promise for improving the clinical viability of deep learning-based breast cancer diagnosis. Further research to address the identified limitations and explore the broader applicability of the approach would be valuable.

Conclusion

This paper presents a novel Bi-level Multi-source unsupervised domain adaptation (BTMuda) method for improving deep learning-based breast cancer diagnosis. The key innovations include a mixed feature extractor, a redesigned transformer architecture, and alignment modules to address both intra-domain and inter-domain shift issues.

Through extensive experiments, the researchers demonstrate that BTMuda outperforms state-of-the-art UDA methods, highlighting the importance of tackling domain shift problems and leveraging multiple relevant source domains to enhance the clinical applicability of deep learning in breast cancer detection.

While the paper makes a valuable contribution to the field, further research is needed to address potential limitations and explore the broader applicability of the BTMuda approach to other medical imaging domains. Nonetheless, this work represents a significant step forward in making deep learning-based breast cancer diagnosis more practical and accessible for real-world clinical use.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

BTMuda: A Bi-level Multi-source unsupervised domain adaptation framework for breast cancer diagnosis

Yuxiang Yang, Xinyi Zeng, Pinxian Zeng, Binyu Yan, Xi Wu, Jiliu Zhou, Yan Wang

Deep learning has revolutionized the early detection of breast cancer, resulting in a significant decrease in mortality rates. However, difficulties in obtaining annotations and huge variations in distribution between training sets and real scenes have limited their clinical applications. To address these limitations, unsupervised domain adaptation (UDA) methods have been used to transfer knowledge from one labeled source domain to the unlabeled target domain, yet these approaches suffer from severe domain shift issues and often ignore the potential benefits of leveraging multiple relevant sources in practical applications. To address these limitations, in this work, we construct a Three-Branch Mixed extractor and propose a Bi-level Multi-source unsupervised domain adaptation method called BTMuda for breast cancer diagnosis. Our method addresses the problems of domain shift by dividing domain shift issues into two levels: intra-domain and inter-domain. To reduce the intra-domain shift, we jointly train a CNN and a Transformer as two paths of a domain mixed feature extractor to obtain robust representations rich in both low-level local and high-level global information. As for the inter-domain shift, we redesign the Transformer delicately to a three-branch architecture with cross-attention and distillation, which learns domain-invariant representations from multiple domains. Besides, we introduce two alignment modules - one for feature alignment and one for classifier alignment - to improve the alignment process. Extensive experiments conducted on three public mammographic datasets demonstrate that our BTMuda outperforms state-of-the-art methods.

9/2/2024

D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms

Tajamul Ashraf, Krithika Rangarajan, Mohit Gambhir, Richa Gabha, Chetan Arora

We focus on the problem of Unsupervised Domain Adaptation (UDA) for breast cancer detection from mammograms (BCDM). Recent advancements have shown that masked image modeling serves as a robust pretext task for UDA. However, when applied to cross-domain BCDM, these techniques struggle with breast abnormalities such as masses, asymmetries, and micro-calcifications, in part due to the typically much smaller size of region of interest in comparison to natural images. This often results in more false positives per image (FPI) and significant noise in pseudo-labels typically used to bootstrap such techniques. Recognizing these challenges, we introduce a transformer-based Domain-invariant Mask Annealed Student Teacher autoencoder (D-MASTER) framework. D-MASTER adaptively masks and reconstructs multi-scale feature maps, enhancing the model's ability to capture reliable target domain features. D-MASTER also includes adaptive confidence refinement to filter pseudo-labels, ensuring only high-quality detections are considered. We also provide a bounding box annotated subset of 1000 mammograms from the RSNA Breast Screening Dataset (referred to as RSNA-BSD1K) to support further research in BCDM. We evaluate D-MASTER on multiple BCDM datasets acquired from diverse domains. Experimental results show a significant improvement of 9% and 13% in sensitivity at 0.3 FPI over state-of-the-art UDA techniques on publicly available benchmark INBreast and DDSM datasets respectively. We also report an improvement of 11% and 17% on In-house and RSNA-BSD1K datasets respectively. The source code, pre-trained D-MASTER model, along with RSNA-BSD1K dataset annotations is available at https://dmaster-iitd.github.io/webpage.

7/10/2024

Unsupervised Domain Adaptation for Low-dose CT Reconstruction via Bayesian Uncertainty Alignment

Kecheng Chen, Jie Liu, Renjie Wan, Victor Ho-Fun Lee, Varut Vardhanabhuti, Hong Yan, Haoliang Li

Low-dose computed tomography (LDCT) image reconstruction techniques can reduce patient radiation exposure while maintaining acceptable imaging quality. Deep learning is widely used in this problem, but the performance of testing data (a.k.a. target domain) is often degraded in clinical scenarios due to the variations that were not encountered in training data (a.k.a. source domain). Unsupervised domain adaptation (UDA) of LDCT reconstruction has been proposed to solve this problem through distribution alignment. However, existing UDA methods fail to explore the usage of uncertainty quantification, which is crucial for reliable intelligent medical systems in clinical scenarios with unexpected variations. Moreover, existing direct alignment for different patients would lead to content mismatch issues. To address these issues, we propose to leverage a probabilistic reconstruction framework to conduct a joint discrepancy minimization between source and target domains in both the latent and image spaces. In the latent space, we devise a Bayesian uncertainty alignment to reduce the epistemic gap between the two domains. This approach reduces the uncertainty level of target domain data, making it more likely to render well-reconstructed results on target domains. In the image space, we propose a sharpness-aware distribution alignment to achieve a match of second-order information, which can ensure that the reconstructed images from the target domain have similar sharpness to normal-dose CT images from the source domain. Experimental results on two simulated datasets and one clinical low-dose imaging dataset show that our proposed method outperforms other methods in quantitative and visualized performance.

6/4/2024

M3BAT: Unsupervised Domain Adaptation for Multimodal Mobile Sensing with Multi-Branch Adversarial Training

Lakmal Meegahapola, Hamza Hassoune, Daniel Gatica-Perez

Over the years, multimodal mobile sensing has been used extensively for inferences regarding health and well being, behavior, and context. However, a significant challenge hindering the widespread deployment of such models in real world scenarios is the issue of distribution shift. This is the phenomenon where the distribution of data in the training set differs from the distribution of data in the real world, the deployment environment. While extensively explored in computer vision and natural language processing, and while prior research in mobile sensing briefly addresses this concern, current work primarily focuses on models dealing with a single modality of data, such as audio or accelerometer readings, and consequently, there is little research on unsupervised domain adaptation when dealing with multimodal sensor data. To address this gap, we did extensive experiments with domain adversarial neural networks (DANN) showing that they can effectively handle distribution shifts in multimodal sensor data. Moreover, we proposed a novel improvement over DANN, called M3BAT, unsupervised domain adaptation for multimodal mobile sensing with multi-branch adversarial training, to account for the multimodality of sensor data during domain adaptation with multiple branches. Through extensive experiments conducted on two multimodal mobile sensing datasets, three inference tasks, and 14 source-target domain pairs, including both regression and classification, we demonstrate that our approach performs effectively on unseen domains. Compared to directly deploying a model trained in the source domain to the target domain, the model shows performance increases up to 12% AUC (area under the receiver operating characteristics curves) on classification tasks, and up to 0.13 MAE (mean absolute error) on regression tasks.

4/29/2024