AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation

Read original: arXiv:2308.14936 - Published 6/28/2024 by Chengyin Li, Prashant Khanduri, Yao Qiang, Rafi Ibn Sultan, Indrin Chetty, Dongxiao Zhu

🛠️

Overview

The Segment Anything Model (SAM) is a pioneering prompt-based foundation model for image segmentation, which has been widely adopted for various medical imaging applications.
However, creating effective prompts for SAM in clinical settings is challenging and time-consuming, requiring the expertise of domain specialists like physicians.
Additionally, recent studies have shown that SAM, originally designed for 2D natural images, performs sub-optimally on 3D medical image segmentation tasks due to the domain gaps between natural and medical images, as well as the differences in spatial arrangements between 2D and 3D images.

Plain English Explanation

To address these challenges, researchers have developed a novel technique called AutoProSAM. This method automates 3D multi-organ CT-based segmentation by leveraging SAM's foundational model capabilities without relying on domain experts for prompts.

The key idea is to use parameter-efficient adaptation techniques to adapt SAM for 3D medical imagery and incorporate an effective automatic prompt learning paradigm specific to this domain. By eliminating the need for manual prompts, AutoProSAM enhances SAM's capabilities for 3D medical image segmentation and achieves state-of-the-art performance in CT-based multi-organ segmentation tasks.

Technical Explanation

The researchers present AutoProSAM, a novel technique that automates 3D multi-organ CT-based segmentation by leveraging SAM's foundational model capabilities without relying on domain experts for prompts. The approach utilizes parameter-efficient adaptation techniques, such as simsam and SAM3D, to adapt SAM for 3D medical imagery. Additionally, it incorporates an effective automatic prompt learning paradigm specific to this domain, eliminating the need for manual prompts.

The researchers also explore other techniques, such as SLIDE-SAM and NNSam, to further enhance the performance of AutoProSAM in 3D medical image segmentation tasks.

Critical Analysis

The researchers acknowledge that while AutoProSAM addresses the challenges of SAM in clinical settings, there may be additional limitations or areas for further research. For instance, the performance of the automatic prompt learning paradigm may be sensitive to the quality and diversity of the training data, and the effectiveness of the parameter-efficient adaptation techniques may vary depending on the specific medical imaging modality and the complexity of the segmentation tasks.

Moreover, the researchers do not explore the potential biases or ethical considerations that may arise from the widespread adoption of such automated segmentation models in medical settings, which could be an important area for further investigation.

Conclusion

In summary, the AutoProSAM technique presents a promising approach to enhance the capabilities of the Segment Anything Model (SAM) for 3D medical image segmentation tasks. By automating the prompt creation process and adapting SAM for 3D medical imagery, AutoProSAM can significantly improve the accessibility and performance of SAM in clinical settings, potentially leading to more efficient and accurate medical image analysis and diagnostic procedures.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🛠️

AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation

Chengyin Li, Prashant Khanduri, Yao Qiang, Rafi Ibn Sultan, Indrin Chetty, Dongxiao Zhu

Segment Anything Model (SAM) is one of the pioneering prompt-based foundation models for image segmentation and has been rapidly adopted for various medical imaging applications. However, in clinical settings, creating effective prompts is notably challenging and time-consuming, requiring the expertise of domain specialists such as physicians. This requirement significantly diminishes SAM's primary advantage - its interactive capability with end users - in medical applications. Moreover, recent studies have indicated that SAM, originally designed for 2D natural images, performs sub optimally on 3D medical image segmentation tasks. This subpar performance is attributed to the domain gaps between natural and medical images and the disparities in spatial arrangements between 2D and 3D images, particularly in multi-organ segmentation applications. To overcome these challenges, we present a novel technique termed AutoProSAM. This method automates 3D multi-organ CT-based segmentation by leveraging SAM's foundational model capabilities without relying on domain experts for prompts. The approach utilizes parameter-efficient adaptation techniques to adapt SAM for 3D medical imagery and incorporates an effective automatic prompt learning paradigm specific to this domain. By eliminating the need for manual prompts, it enhances SAM's capabilities for 3D medical image segmentation and achieves state-of-the-art (SOTA) performance in CT-based multi-organ segmentation tasks.

6/28/2024

🖼️

Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting

Xian Lin, Yangyang Xiang, Li Yu, Zengqiang Yan

End-to-end medical image segmentation is of great value for computer-aided diagnosis dominated by task-specific models, usually suffering from poor generalization. With recent breakthroughs brought by the segment anything model (SAM) for universal image segmentation, extensive efforts have been made to adapt SAM for medical imaging but still encounter two major issues: 1) severe performance degradation and limited generalization without proper adaptation, and 2) semi-automatic segmentation relying on accurate manual prompts for interaction. In this work, we propose SAMUS as a universal model tailored for ultrasound image segmentation and further enable it to work in an end-to-end manner denoted as AutoSAMUS. Specifically, in SAMUS, a parallel CNN branch is introduced to supplement local information through cross-branch attention, and a feature adapter and a position adapter are jointly used to adapt SAM from natural to ultrasound domains while reducing training complexity. AutoSAMUS is realized by introducing an auto prompt generator (APG) to replace the manual prompt encoder of SAMUS to automatically generate prompt embeddings. A comprehensive ultrasound dataset, comprising about 30k images and 69k masks and covering six object categories, is collected for verification. Extensive comparison experiments demonstrate the superiority of SAMUS and AutoSAMUS against the state-of-the-art task-specific and SAM-based foundation models. We believe the auto-prompted SAM-based model has the potential to become a new paradigm for end-to-end medical image segmentation and deserves more exploration. Code and data are available at https://github.com/xianlin7/SAMUS.

7/9/2024

Segment anything model 2: an application to 2D and 3D medical images

Haoyu Dong, Hanxue Gu, Yaqian Chen, Jichen Yang, Yuwen Chen, Maciej A. Mazurowski

Segment Anything Model (SAM) has gained significant attention because of its ability to segment various objects in images given a prompt. The recently developed SAM 2 has extended this ability to video inputs. This opens an opportunity to apply SAM to 3D images, one of the fundamental tasks in the medical imaging field. In this paper, we extensively evaluate SAM 2's ability to segment both 2D and 3D medical images by first collecting 21 medical imaging datasets, including surgical videos, common 3D modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) as well as 2D modalities such as X-ray and ultrasound. Two evaluation settings of SAM 2 are considered: (1) multi-frame 3D segmentation, where prompts are provided to one or multiple slice(s) selected from the volume, and (2) single-frame 2D segmentation, where prompts are provided to each slice. The former only applies to videos and 3D modalities, while the latter applies to all datasets. Our results show that SAM 2 exhibits similar performance as SAM under single-frame 2D segmentation, and has variable performance under multi-frame 3D segmentation depending on the choices of slices to annotate, the direction of the propagation, the predictions utilized during the propagation, etc. We believe our work enhances the understanding of SAM 2's behavior in the medical field and provides directions for future work in adapting SAM 2 to this domain. Our code is available at: https://github.com/mazurowski-lab/segment-anything2-medical-evaluation.

8/23/2024

Segmentation by registration-enabled SAM prompt engineering using five reference images

Yaxi Chen, Aleksandra Ivanova, Shaheer U. Saeed, Rikin Hargunani, Jie Huang, Chaozong Liu, Yipeng Hu

The recently proposed Segment Anything Model (SAM) is a general tool for image segmentation, but it requires additional adaptation and careful fine-tuning for medical image segmentation, especially for small, irregularly-shaped, and boundary-ambiguous anatomical structures such as the knee cartilage that is of interest in this work. Repaired cartilage, after certain surgical procedures, exhibits imaging patterns unseen to pre-training, posing further challenges for using models like SAM with or without general-purpose fine-tuning. To address this, we propose a novel registration-based prompt engineering framework for medical image segmentation using SAM. This approach utilises established image registration algorithms to align the new image (to-be-segmented) and a small number of reference images, without requiring segmentation labels. The spatial transformations generated by registration align either the new image or pre-defined point-based prompts, before using them as input to SAM. This strategy, requiring as few as five reference images with defined point prompts, effectively prompts SAM for inference on new images, without needing any segmentation labels. Evaluation of MR images from patients who received cartilage stem cell therapy yielded Dice scores of 0.89, 0.87, 0.53, and 0.52 for segmenting femur, tibia, femoral- and tibial cartilages, respectively. This outperforms atlas-based label fusion and is comparable to supervised nnUNet, an upper-bound fair baseline in this application, both of which require full segmentation labels for reference samples. The codes are available at: https://github.com/chrissyinreallife/KneeSegmentWithSAM.git

7/26/2024