Learning a Clinically-Relevant Concept Bottleneck for Lesion Detection in Breast Ultrasound

Read original: arXiv:2407.00267 - Published 7/2/2024 by Arianna Bunnell, Yannik Glaser, Dustin Valdez, Thomas Wolfgruber, Aleen Altamirano, Carol Zamora González, Brenda Y. Hernandez, Peter Sadowski, John A. Shepherd

Overview

This paper proposes an explainable AI model for detecting and classifying lesions in breast ultrasound images.
The model uses a concept bottleneck layer to predict known features from the American College of Radiology's Breast Imaging Reporting and Data System (BI-RADS) before making a final cancer classification.
This allows radiologists to review and potentially fix the AI's predictions by modifying the concept predictions.
Experiments show the model outperforms state-of-the-art lesion detection frameworks and the concept intervention improves cancer classification performance.

Plain English Explanation

Breast cancer is a major health concern, especially in regions with limited access to mammography screening. Artificial intelligence (AI) systems can help detect and classify lesions in breast ultrasound images, potentially reducing the burden of cancer in these areas. However, for AI systems to be useful in a clinical setting, their predictions need to be easy for radiologists to understand and verify.

This paper presents an AI model that provides interpretable predictions using a standard medical terminology, the BI-RADS lexicon. The model first predicts the known BI-RADS features of a lesion, such as its shape, margin, and orientation, before making a final prediction about whether the lesion is cancerous.

This "concept bottleneck" approach allows radiologists to review the AI's reasoning and make adjustments if needed. For example, if the AI misjudges one of the lesion's features, such as its margin, the radiologist can correct that concept prediction, and the model updates its final cancer prediction based on the corrected features.

In experiments, this model outperformed other state-of-the-art lesion detection frameworks and the concept intervention was shown to improve the overall cancer classification performance.

Technical Explanation

The proposed model is a deep neural network that incorporates a concept bottleneck layer to predict known BI-RADS features before making a final cancer classification. This enables radiologists to easily review and potentially fix the AI's predictions by modifying the concept predictions.
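To make the intervention mechanism concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the concept names, weights, and bias are hypothetical values chosen for illustration, and the paper's model is a deep network rather than a three-input logistic head. The sketch only shows the structural idea that the final prediction depends on the concepts alone, so editing a concept changes the output.

```python
import math
from typing import Dict

# Hypothetical BI-RADS-style concepts; names and weights are illustrative
# assumptions, not values taken from the paper.
CONCEPTS = ["irregular_shape", "non_circumscribed_margin", "non_parallel_orientation"]
WEIGHTS = {
    "irregular_shape": 2.0,
    "non_circumscribed_margin": 2.5,
    "non_parallel_orientation": 1.5,
}
BIAS = -3.0


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def cancer_probability(concept_probs: Dict[str, float]) -> float:
    """Final prediction computed only from the concept predictions (the bottleneck)."""
    logit = BIAS + sum(WEIGHTS[name] * concept_probs[name] for name in CONCEPTS)
    return sigmoid(logit)


def intervene(concept_probs: Dict[str, float], corrections: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of the concept predictions with reviewer-supplied corrections applied."""
    unknown = set(corrections) - set(CONCEPTS)
    if unknown:
        raise KeyError(f"unknown concepts: {sorted(unknown)}")
    return {**concept_probs, **corrections}


# Concept probabilities as a (hypothetical) concept predictor might emit them.
predicted = {
    "irregular_shape": 0.9,
    "non_circumscribed_margin": 0.8,
    "non_parallel_orientation": 0.2,
}
before = cancer_probability(predicted)

# The reviewer judges the margin to be circumscribed and overrides that concept.
corrected = intervene(predicted, {"non_circumscribed_margin": 0.0})
after = cancer_probability(corrected)

print(f"before intervention: {before:.3f}, after intervention: {after:.3f}")
```

Because the classifier head never sees the image directly, a single corrected concept is enough to move the final score, which is what makes this kind of review practical.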

The researchers developed the model on a dataset of 8,854 breast ultrasound images from 994 women, with expert annotations and histological cancer labels. On the held-out test set, the model achieved an average precision of 48.9 for lesion detection, outperforming state-of-the-art detection frameworks. For cancer classification, concept intervention increased the area under the receiver operating characteristic curve (AUROC) from 0.876 to 0.885, demonstrating a benefit of the interpretable approach.
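For readers unfamiliar with the classification metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen cancerous lesion receives a higher score than a randomly chosen benign one. The sketch below computes it by direct pairwise comparison. The labels and scores are toy values invented for illustration, not data from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = malignant, 0 = benign (invented values).
labels = [1, 1, 1, 0, 0, 0]
scores_before = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]   # scores from predicted concepts
scores_after = [0.9, 0.7, 0.55, 0.5, 0.3, 0.2]   # scores after concept corrections

auroc_before = auroc(labels, scores_before)
auroc_after = auroc(labels, scores_after)
print(f"AUROC before: {auroc_before:.3f}, after: {auroc_after:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch; rank-based formulations give the same value more efficiently on large test sets.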

Critical Analysis

The paper provides a promising approach for developing explainable AI systems for breast cancer diagnosis using ultrasound imaging. The concept bottleneck layer allows radiologists to understand and potentially correct the AI's predictions, which is an important step towards building trust and adoption in clinical settings.

However, the paper does not address several potential limitations. For example, the dataset used for training and evaluation is relatively small, and it's unclear how the model would perform on a larger, more diverse set of images. Additionally, the paper does not discuss the computational efficiency of the model, which is an important consideration for real-time clinical use.

Furthermore, the paper does not compare the performance of the concept bottleneck model to that of radiologists' own interpretations of the BI-RADS features. It would be valuable to understand how the AI's concept predictions align with human experts' assessments and whether the concept intervention leads to meaningful improvements in clinical decision-making.

Conclusion

This paper presents an innovative approach to developing explainable AI systems for breast cancer diagnosis using ultrasound imaging. By incorporating a concept bottleneck layer that predicts known BI-RADS features, the model allows radiologists to review and potentially correct the AI's predictions, enhancing its clinical utility.

The promising results suggest that this type of interpretable AI model could be a valuable tool for improving breast cancer detection and classification, especially in regions with limited access to mammography. Further research is needed to address the limitations and explore the real-world impact of this technology in clinical settings.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning a Clinically-Relevant Concept Bottleneck for Lesion Detection in Breast Ultrasound

Arianna Bunnell, Yannik Glaser, Dustin Valdez, Thomas Wolfgruber, Aleen Altamirano, Carol Zamora González, Brenda Y. Hernandez, Peter Sadowski, John A. Shepherd

Detecting and classifying lesions in breast ultrasound images is a promising application of artificial intelligence (AI) for reducing the burden of cancer in regions with limited access to mammography. Such AI systems are more likely to be useful in a clinical setting if their predictions can be explained to a radiologist. This work proposes an explainable AI model that provides interpretable predictions using a standard lexicon from the American College of Radiology's Breast Imaging and Reporting Data System (BI-RADS). The model is a deep neural network featuring a concept bottleneck layer in which known BI-RADS features are predicted before making a final cancer classification. This enables radiologists to easily review the predictions of the AI system and potentially fix errors in real time by modifying the concept predictions. In experiments, a model is developed on 8,854 images from 994 women with expert annotations and histological cancer labels. The model outperforms state-of-the-art lesion detection frameworks with 48.9 average precision on the held-out testing set, and for cancer classification, concept intervention is shown to increase performance from 0.876 to 0.885 area under the receiver operating characteristic curve. Training and evaluation code is available at https://github.com/hawaii-ai/bus-cbm.

7/2/2024

Integrating Clinical Knowledge into Concept Bottleneck Models

Winnie Pang, Xueyi Ke, Satoshi Tsutsui, Bihan Wen

Concept bottleneck models (CBMs), which predict human-interpretable concepts (e.g., nucleus shapes in cell images) before predicting the final output (e.g., cell type), provide insights into the decision-making processes of the model. However, training CBMs solely in a data-driven manner can introduce undesirable biases, which may compromise prediction performance, especially when the trained models are evaluated on out-of-domain images (e.g., those acquired using different devices). To mitigate this challenge, we propose integrating clinical knowledge to refine CBMs, better aligning them with clinicians' decision-making processes. Specifically, we guide the model to prioritize the concepts that clinicians also prioritize. We validate our approach on two datasets of medical images: white blood cell and skin images. Empirical validation demonstrates that incorporating medical guidance enhances the model's classification performance on unseen datasets with varying preparation methods, thereby increasing its real-world applicability.

7/10/2024

Breast tumor classification based on self-supervised contrastive learning from ultrasound videos

Yunxin Tang, Siyuan Tang, Jian Zhang, Hao Chen

Background: Breast ultrasound is prominently used in diagnosing breast tumors. At present, many automatic systems based on deep learning have been developed to help radiologists in diagnosis. However, training such systems remains challenging because they are usually data-hungry and demand large amounts of labeled data, which need professional knowledge and are expensive. Methods: We adopted a triplet network and a self-supervised contrastive learning technique to learn representations from unlabeled breast ultrasound video clips. We further designed a new hard triplet loss to learn representations that particularly discriminate positive and negative image pairs that are hard to recognize. We also constructed a pretraining dataset from breast ultrasound videos (1,360 videos from 200 patients), which includes an anchor sample dataset with 11,805 images, a positive sample dataset with 188,880 images, and a negative sample dataset dynamically generated from video clips. Further, we constructed a finetuning dataset, including 400 images from 66 patients. We transferred the pretrained network to a downstream benign/malignant classification task and compared the performance with other state-of-the-art models, including three models pretrained on ImageNet and a previous contrastive learning model retrained on our datasets. Results and conclusion: Experiments revealed that our model achieved an area under the receiver operating characteristic curve (AUC) of 0.952, which is significantly higher than the others. Further, we assessed the dependence of our pretrained model on the number of labeled data and revealed that <100 samples were required to achieve an AUC of 0.901. The proposed framework greatly reduces the demand for labeled data and holds potential for use in automatic breast ultrasound image diagnosis.

8/21/2024

Coarse-to-Fine Concept Bottleneck Models

Konstantinos P. Panousis, Dino Ienco, Diego Marcos

Deep learning algorithms have recently gained significant attention due to their impressive performance. However, their high complexity and un-interpretable mode of operation hinder their confident deployment in real-world safety-critical tasks. This work targets ante hoc interpretability, and specifically Concept Bottleneck Models (CBMs). Our goal is to design a framework that admits a highly interpretable decision making process with respect to human understandable concepts, on two levels of granularity. To this end, we propose a novel two-level concept discovery formulation leveraging: (i) recent advances in vision-language models, and (ii) an innovative formulation for coarse-to-fine concept selection via data-driven and sparsity-inducing Bayesian arguments. Within this framework, concept information does not solely rely on the similarity between the whole image and general unstructured concepts; instead, we introduce the notion of concept hierarchy to uncover and exploit more granular concept information residing in patch-specific regions of the image scene. As we experimentally show, the proposed construction not only outperforms recent CBM approaches, but also yields a principled framework towards interpretability.

6/28/2024