Ultrasound SAM Adapter: Adapting SAM for Breast Lesion Segmentation in Ultrasound Images

Read original: arXiv:2404.14837 - Published 4/24/2024 by Zhengzheng Tu, Le Gu, Xixi Wang, Bo Jiang

Overview

The paper introduces a novel Breast Ultrasound Segment Anything Model (BUSSAM) that adapts the Segment Anything Model (SAM) to the domain of breast ultrasound image segmentation.
The key challenges addressed are the large domain gap between natural and medical images, as well as the lack of large-scale annotated ultrasound image data.
BUSSAM combines a novel CNN image encoder, a Cross-Branch Adapter, and Position and Feature Adapters to adapt SAM to breast ultrasound image segmentation.

Plain English Explanation

The Segment Anything Model (SAM) is a powerful AI system that can accurately segment objects in natural images. However, it struggles with medical images, like ultrasound scans, because they are very different from the natural images it was trained on.

To address this, the researchers developed a new model called the Breast Ultrasound Segment Anything Model (BUSSAM). BUSSAM takes the original SAM and adapts it to work well with breast ultrasound images.

The key innovations are:

A new CNN-based image encoder that focuses on learning the unique features of ultrasound images, which complements the ViT-based encoder in SAM.
A "Cross-Branch Adapter" that allows the CNN and ViT encoders to effectively work together.
"Position" and "Feature" adapters that fine-tune the original SAM to perform better on ultrasound data.

By incorporating these components, BUSSAM significantly outperforms other medical image segmentation models on benchmark breast ultrasound datasets. This research demonstrates how adapting powerful AI models like SAM can unlock their potential for specialized medical applications.

Technical Explanation

The paper begins by highlighting the limitations of the Segment Anything Model (SAM) for medical image segmentation tasks, particularly in the domain of breast ultrasound imaging. The authors identify the large domain gap between natural and medical images, as well as the scarcity of large-scale annotated ultrasound image data, as the key challenges.

To address these issues, the researchers develop the Breast Ultrasound Segment Anything Model (BUSSAM). The core components of BUSSAM include:

Novel CNN Image Encoder: The authors design a lightweight CNN-based image encoder that focuses on learning features from the local receptive field of ultrasound images. This complements the Vision Transformer (ViT) encoder used in the original SAM.
Cross-Branch Adapter: A novel module that allows the CNN encoder and ViT encoder to fully interact and share information, leveraging the strengths of both.
Position and Feature Adapters: These adapters are added to the ViT branch of the original SAM to fine-tune it for the breast ultrasound image segmentation task.
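The two ideas in the list above, a small trainable adapter attached to a frozen ViT block and an attention step that lets the two branches exchange information, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class names, dimensions, the bottleneck design, the zero initialisation, and the choice to take queries from the ViT tokens and keys/values from the CNN tokens are all assumptions made for illustration. The Position Adapter is not covered here.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class BottleneckAdapter:
    """Hypothetical feature adapter: x + up(relu(down(x))).

    In an adapter setup only these two small matrices would be trained,
    while the surrounding ViT weights stay frozen.
    """

    def __init__(self, dim, hidden):
        self.w_down = rng.normal(0.0, 0.02, size=(dim, hidden))
        # Zero-initialised up-projection (an assumption): the adapter
        # starts out as an identity map.
        self.w_up = np.zeros((hidden, dim))

    def __call__(self, tokens):
        h = np.maximum(tokens @ self.w_down, 0.0)  # ReLU bottleneck
        return tokens + h @ self.w_up              # residual connection


class CrossBranchAttention:
    """Hypothetical single-head cross-attention between the two branches.

    Queries come from the ViT tokens and keys/values from the CNN tokens,
    so local CNN features are mixed into the ViT stream (assumed direction).
    """

    def __init__(self, dim):
        self.w_q = rng.normal(0.0, 0.02, size=(dim, dim))
        self.w_k = rng.normal(0.0, 0.02, size=(dim, dim))
        self.w_v = rng.normal(0.0, 0.02, size=(dim, dim))
        self.scale = dim ** -0.5

    def __call__(self, vit_tokens, cnn_tokens):
        q = vit_tokens @ self.w_q
        k = cnn_tokens @ self.w_k
        v = cnn_tokens @ self.w_v
        attn = softmax(q @ k.T * self.scale, axis=-1)  # (n_vit, n_cnn)
        return vit_tokens + attn @ v                   # residual connection


# Toy token sequences: 16 ViT tokens and 64 CNN tokens, both 32-dimensional.
vit_tokens = rng.normal(size=(16, 32))
cnn_tokens = rng.normal(size=(64, 32))

adapted = BottleneckAdapter(dim=32, hidden=8)(vit_tokens)
fused = CrossBranchAttention(dim=32)(adapted, cnn_tokens)
```

With the zero-initialised up-projection, `adapted` equals `vit_tokens` before any training, which is one common way to make sure an adapter does not disturb the pretrained model at the start of fine-tuning. The fused output keeps the ViT token count and width, so it could be passed on to the next ViT block unchanged.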

The experimental results on the AMUBUS and BUSI breast ultrasound datasets demonstrate that BUSSAM significantly outperforms other medical image segmentation models. This highlights the effectiveness of the proposed adaptations in bridging the domain gap and enabling the powerful SAM to perform well on ultrasound images.
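This summary does not say which metrics the comparison is based on. Segmentation models are commonly compared with overlap scores such as the Dice coefficient and Intersection over Union (IoU), so the sketch below shows how those two scores are computed for a pair of binary masks. The function name and the toy masks are made up for illustration and are not taken from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared.
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
reference = [[0, 1, 0],
             [0, 1, 0],
             [0, 1, 0]]

dice, iou = dice_and_iou(predicted, reference)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")  # Dice = 0.667, IoU = 0.500
```

Both scores range from 0 (no overlap) to 1 (identical masks). Dice is never lower than IoU for the same pair of masks.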

Critical Analysis

The paper presents a well-designed and comprehensive approach to adapting the Segment Anything Model (SAM) for breast ultrasound image segmentation. The researchers have carefully identified the key challenges and limitations of using SAM for medical imaging tasks, and have developed novel solutions to address them.

One potential limitation of the BUSSAM model is that it may still require a substantial amount of annotated ultrasound data for effective fine-tuning, despite the use of adapters. The authors do not discuss the sensitivity of their approach to the size of the training dataset, which could be an important consideration for practical deployment.

Additionally, the paper focuses on breast ultrasound segmentation, and it would be interesting to see how the BUSSAM model could be further extended or adapted to other types of medical imaging modalities, such as CT or MRI scans. Exploring the generalizability of the proposed techniques could further enhance their impact on the broader medical imaging field.

Overall, the BUSSAM model represents a valuable contribution to the field of medical image segmentation, demonstrating the potential of adapting powerful general-purpose models like SAM to specialized domains. The researchers have presented a well-executed study with promising results, and their work serves as a foundation for further advancements in this area.

Conclusion

The Breast Ultrasound Segment Anything Model (BUSSAM) proposed in this paper adapts the Segment Anything Model (SAM) to breast ultrasound image segmentation. By combining a novel CNN image encoder, a Cross-Branch Adapter, and Position and Feature Adapters, the researchers narrow the gap between natural and medical images, and the adapted model significantly outperforms other medical image segmentation models on the AMUBUS and BUSI datasets.

This work highlights the potential of leveraging powerful general-purpose AI models, like SAM, for specialized medical applications through careful adaptation and domain-specific enhancements. The innovative techniques presented in this paper pave the way for further advancements in medical image segmentation, with broader implications for the integration of advanced AI models into healthcare workflows and decision-making processes.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Ultrasound SAM Adapter: Adapting SAM for Breast Lesion Segmentation in Ultrasound Images

Zhengzheng Tu, Le Gu, Xixi Wang, Bo Jiang

Segment Anything Model (SAM) has recently achieved amazing results in the field of natural image segmentation. However, it is not effective for medical image segmentation, owing to the large domain gap between natural and medical images. In this paper, we mainly focus on ultrasound image segmentation. As we know that it is very difficult to train a foundation model for ultrasound image data due to the lack of large-scale annotated ultrasound image data. To address these issues, in this paper, we develop a novel Breast Ultrasound SAM Adapter, termed Breast Ultrasound Segment Anything Model (BUSSAM), which migrates the SAM to the field of breast ultrasound image segmentation by using the adapter technique. To be specific, we first design a novel CNN image encoder, which is fully trained on the BUS dataset. Our CNN image encoder is more lightweight, and focuses more on features of local receptive field, which provides the complementary information to the ViT branch in SAM. Then, we design a novel Cross-Branch Adapter to allow the CNN image encoder to fully interact with the ViT image encoder in SAM module. Finally, we add both of the Position Adapter and the Feature Adapter to the ViT branch to fine-tune the original SAM. The experimental results on AMUBUS and BUSI datasets demonstrate that our proposed model outperforms other medical image segmentation models significantly. Our code will be available at: https://github.com/bscs12/BUSSAM.

4/24/2024

CC-SAM: SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation

Shreyank N Gowda, David A. Clifton

The Segment Anything Model (SAM) has achieved remarkable successes in the realm of natural image segmentation, but its deployment in the medical imaging sphere has encountered challenges. Specifically, the model struggles with medical images that feature low contrast, faint boundaries, intricate morphologies, and small-sized objects. To address these challenges and enhance SAM's performance in the medical domain, we introduce a comprehensive modification. Firstly, we incorporate a frozen Convolutional Neural Network (CNN) branch as an image encoder, which synergizes with SAM's original Vision Transformer (ViT) encoder through a novel variational attention fusion module. This integration bolsters the model's capability to capture local spatial information, which is often paramount in medical imagery. Moreover, to further optimize SAM for medical imaging, we introduce feature and position adapters within the ViT branch, refining the encoder's representations. We see that compared to current prompting strategies to fine-tune SAM for ultrasound medical segmentation, the use of text descriptions that serve as text prompts for SAM helps significantly improve the performance. Leveraging ChatGPT's natural language understanding capabilities, we generate prompts that offer contextual information and guidance to SAM, enabling it to better understand the nuances of ultrasound medical images and improve its segmentation accuracy. Our method, in its entirety, represents a significant stride towards making universal image segmentation models more adaptable and efficient in the medical domain.

8/2/2024

Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting

Xian Lin, Yangyang Xiang, Li Yu, Zengqiang Yan

End-to-end medical image segmentation is of great value for computer-aided diagnosis dominated by task-specific models, usually suffering from poor generalization. With recent breakthroughs brought by the segment anything model (SAM) for universal image segmentation, extensive efforts have been made to adapt SAM for medical imaging but still encounter two major issues: 1) severe performance degradation and limited generalization without proper adaptation, and 2) semi-automatic segmentation relying on accurate manual prompts for interaction. In this work, we propose SAMUS as a universal model tailored for ultrasound image segmentation and further enable it to work in an end-to-end manner denoted as AutoSAMUS. Specifically, in SAMUS, a parallel CNN branch is introduced to supplement local information through cross-branch attention, and a feature adapter and a position adapter are jointly used to adapt SAM from natural to ultrasound domains while reducing training complexity. AutoSAMUS is realized by introducing an auto prompt generator (APG) to replace the manual prompt encoder of SAMUS to automatically generate prompt embeddings. A comprehensive ultrasound dataset, comprising about 30k images and 69k masks and covering six object categories, is collected for verification. Extensive comparison experiments demonstrate the superiority of SAMUS and AutoSAMUS against the state-of-the-art task-specific and SAM-based foundation models. We believe the auto-prompted SAM-based model has the potential to become a new paradigm for end-to-end medical image segmentation and deserves more exploration. Code and data are available at https://github.com/xianlin7/SAMUS.

7/9/2024

Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation

Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu

Tumor lesion segmentation on CT or MRI images plays a critical role in cancer diagnosis and treatment planning. Considering the inherent differences in tumor lesion segmentation data across various medical imaging modalities and equipment, integrating medical knowledge into the Segment Anything Model (SAM) presents promising capability due to its versatility and generalization potential. Recent studies have attempted to enhance SAM with medical expertise by pre-training on large-scale medical segmentation datasets. However, challenges still exist in 3D tumor lesion segmentation owing to tumor complexity and the imbalance in foreground and background regions. Therefore, we introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation. We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks, facilitating the generation of more precise segmentation masks. Furthermore, an iterative refinement scheme is implemented in M-SAM to refine the segmentation masks progressively, leading to improved performance. Extensive experiments on seven tumor lesion segmentation datasets indicate that our M-SAM not only achieves high segmentation accuracy but also exhibits robust generalization. The code is available at https://github.com/nanase1025/M-SAM.

7/12/2024