DeSAM: Decoupled Segment Anything Model for Generalizable Medical Image Segmentation

Read original: arXiv:2306.00499 - Published 7/10/2024 by Yifan Gao, Wei Xia, Dingdu Hu, Wenkui Wang, Xin Gao

Overview

Medical image segmentation models often struggle with domain shift, where they perform poorly on data from unseen domains
The Segment Anything Model (SAM) shows promise for improving cross-domain robustness, but it performs significantly worse in automatic segmentation scenarios than when manually prompted
The authors propose Decoupled SAM (DeSAM), which modifies SAM to address the coupling effect of poor prompts and mask generation

Plain English Explanation

Artificial intelligence (AI) models trained to segment medical images often have trouble when applied to new types of medical data they haven't seen before. This is known as the domain shift problem. The Segment Anything Model (SAM) is an AI model that has shown potential for improving cross-domain robustness, that is, for performing well across different types of medical data.

However, SAM's performance degrades when it tries to automatically segment images without being manually prompted by a human. To address this issue, the researchers developed a new model called Decoupled SAM (DeSAM). DeSAM modifies SAM's architecture to separate the prompt-relevant information from the mask generation, allowing it to leverage SAM's pre-trained weights while minimizing the performance drop caused by poor prompts.

The researchers tested DeSAM on publicly available cross-site prostate and cross-modality abdominal image segmentation datasets. The results show that DeSAM substantially outperforms previous state-of-the-art domain generalization methods, meaning it adapts better to new types of medical data.

Technical Explanation

The authors observed that SAM, a powerful prompt-driven foundation model for image segmentation, performs significantly worse in automatic segmentation scenarios than when it is manually prompted by a human. They traced this degradation to the coupling effect of inevitable poor prompts and mask generation.
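To illustrate why poor prompts are hard to avoid without a human in the loop, here is a minimal sketch. It assumes that automatic prompting places point prompts on a regular grid over the image, which is one common way to run SAM without manual input. The function names, the image size, and the toy target box are hypothetical and are not taken from the paper.

```python
def grid_point_prompts(height, width, points_per_side):
    """Return (x, y) point prompts at the centers of a regular grid of cells."""
    step_y = height / points_per_side
    step_x = width / points_per_side
    return [
        ((col + 0.5) * step_x, (row + 0.5) * step_y)
        for row in range(points_per_side)
        for col in range(points_per_side)
    ]


def inside_box(point, box):
    """True if the point lies in the half-open box (x0, y0, x1, y1)."""
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x < x1 and y0 <= y < y1


# Hypothetical 8x8 image with a small target structure occupying a 2x2 region.
prompts = grid_point_prompts(height=8, width=8, points_per_side=4)
target_box = (2, 2, 4, 4)
on_target = [p for p in prompts if inside_box(p, target_box)]

print(len(prompts), len(on_target))  # 16 prompts, only 1 lands on the target
```

In this toy setup most prompts fall on background rather than on the structure of interest, which is the kind of poor prompt that the authors argue becomes coupled with mask generation.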

To address this coupling effect, the authors propose Decoupled SAM (DeSAM). DeSAM modifies SAM's mask decoder by introducing two new modules: a prompt-relevant IoU module (PRIM) and a prompt-decoupled mask module (PDMM). PRIM predicts the IoU (Intersection over Union) score and generates mask embeddings, while PDMM extracts multi-scale features from the intermediate layers of the image encoder and fuses them with the mask embeddings from PRIM to generate the final segmentation mask.
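For reference, the IoU score that PRIM is trained to predict measures the overlap between a predicted mask and the ground truth. The sketch below computes it for two binary masks stored as nested lists. The function name `mask_iou` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details from the paper.

```python
def mask_iou(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(mask_iou(pred, target))  # 2 shared pixels / 4 pixels in the union = 0.5
```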

This decoupled design allows DeSAM to leverage the pre-trained weights of SAM while minimizing the performance degradation caused by poor prompts. The authors conducted experiments on publicly available cross-site prostate and cross-modality abdominal image segmentation datasets, and the results show that DeSAM outperforms previous state-of-the-art domain generalization methods.
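The data flow of the decoupled design can be sketched with toy stand-ins. This is not the authors' implementation: the function names `prim` and `pdmm`, the use of short float lists as "features", and all of the arithmetic are placeholders chosen only to show which module consumes which input. The point it illustrates is that the prompt enters only through PRIM, while PDMM works from encoder features plus PRIM's mask embedding.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class PrimOutput:
    iou_score: float        # predicted mask quality for this prompt
    mask_embedding: Vector  # handed on to the mask module


def prim(image_embedding: Vector, prompt_embedding: Vector) -> PrimOutput:
    """Toy stand-in for PRIM: the only place the prompt is consumed."""
    # Placeholder arithmetic: a dot product clamped to [0, 1] as a fake IoU score.
    dot = sum(i * p for i, p in zip(image_embedding, prompt_embedding))
    iou_score = max(0.0, min(1.0, dot))
    mask_embedding = [i + p for i, p in zip(image_embedding, prompt_embedding)]
    return PrimOutput(iou_score, mask_embedding)


def pdmm(multi_scale_features: List[Vector], mask_embedding: Vector) -> List[int]:
    """Toy stand-in for PDMM: fuses encoder features with the mask embedding."""
    n_layers = len(multi_scale_features)
    # Placeholder fusion: average the per-layer features, then add the embedding.
    fused = [
        sum(layer[k] for layer in multi_scale_features) / n_layers + mask_embedding[k]
        for k in range(len(mask_embedding))
    ]
    # Threshold to a binary "mask" with one value per toy pixel.
    return [1 if value > 0.5 else 0 for value in fused]


# Hypothetical features from two intermediate encoder layers (3 toy pixels each).
intermediate = [[0.0, 0.5, 0.125], [0.0, 1.0, 0.125]]
image_embedding = intermediate[-1]
prompt_embedding = [0.0, 0.5, 0.0]

out = prim(image_embedding, prompt_embedding)
mask = pdmm(intermediate, out.mask_embedding)
print(out.iou_score, mask)
```

Note that `pdmm` has no prompt argument; in this sketch the prompt can influence the mask only indirectly, through the embedding produced by `prim`.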

Critical Analysis

The authors acknowledge that their proposed DeSAM model still has room for improvement, particularly in terms of the quality of the generated segmentation masks. They suggest that further research could explore ways to better integrate the prompt-relevant information into the mask generation process, potentially through more sophisticated fusion techniques or the use of perturbed prompts for robust adaptation.

Additionally, the authors only evaluated DeSAM on medical image segmentation tasks, and it would be valuable to assess its performance on a broader range of computer vision applications, including surgical instrument segmentation or other types of complex visual understanding tasks.

Conclusion

The Decoupled SAM (DeSAM) model proposed in this paper addresses the domain shift problem in medical image segmentation. By decoupling the prompt-relevant information from the mask generation process, DeSAM is able to leverage the pre-trained Segment Anything Model (SAM) while reducing its performance degradation in automatic segmentation scenarios.

The authors' empirical results demonstrate that DeSAM outperforms previous state-of-the-art domain generalization methods, suggesting that this approach has promising implications for improving the robustness and applicability of medical image segmentation AI models in real-world clinical settings.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DeSAM: Decoupled Segment Anything Model for Generalizable Medical Image Segmentation

Yifan Gao, Wei Xia, Dingdu Hu, Wenkui Wang, Xin Gao

Deep learning-based medical image segmentation models often suffer from domain shift, where the models trained on a source domain do not generalize well to other unseen domains. As a prompt-driven foundation model with powerful generalization capabilities, the Segment Anything Model (SAM) shows potential for improving the cross-domain robustness of medical image segmentation. However, SAM performs significantly worse in automatic segmentation scenarios than when manually prompted, hindering its direct application to domain generalization. Upon further investigation, we discovered that the degradation in performance was related to the coupling effect of inevitable poor prompts and mask generation. To address the coupling effect, we propose the Decoupled SAM (DeSAM). DeSAM modifies SAM's mask decoder by introducing two new modules: a prompt-relevant IoU module (PRIM) and a prompt-decoupled mask module (PDMM). PRIM predicts the IoU score and generates mask embeddings, while PDMM extracts multi-scale features from the intermediate layers of the image encoder and fuses them with the mask embeddings from PRIM to generate the final segmentation mask. This decoupled design allows DeSAM to leverage the pre-trained weights while minimizing the performance degradation caused by poor prompts. We conducted experiments on publicly available cross-site prostate and cross-modality abdominal image segmentation datasets. The results show that our DeSAM leads to a substantial performance improvement over previous state-of-the-art domain generalization methods. The code is publicly available at https://github.com/yifangao112/DeSAM.

7/10/2024

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything

Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang

With the development of Deep Neural Networks (DNNs), many efforts have been made to handle medical image segmentation. Traditional methods such as nnUNet train specific segmentation models on the individual datasets. Plenty of recent methods have been proposed to adapt the foundational Segment Anything Model (SAM) to medical image segmentation. However, they still focus on discrete representations to generate pixel-wise predictions, which are spatially inflexible and scale poorly to higher resolution. In contrast, implicit methods learn continuous representations for segmentation, which is crucial for medical image segmentation. In this paper, we propose I-MedSAM, which leverages the benefits of both continuous representations and SAM, to obtain better cross-domain ability and accurate boundary delineation. Since medical image segmentation needs to predict detailed segmentation boundaries, we designed a novel adapter to enhance the SAM features with high-frequency information during Parameter-Efficient Fine-Tuning (PEFT). To convert the SAM features and coordinates into continuous segmentation output, we utilize Implicit Neural Representation (INR) to learn an implicit segmentation decoder. We also propose an uncertainty-guided sampling strategy for efficient learning of INR. Extensive evaluations on 2D medical image segmentation tasks have shown that our proposed method with only 1.6M trainable parameters outperforms existing methods including discrete and implicit methods. The code will be available at: https://github.com/ucwxb/I-MedSAM.

7/12/2024

nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance

Yunxiang Li, Bowen Jing, Zihan Li, Jing Wang, You Zhang

Automatic segmentation of medical images is crucial in modern clinical workflows. The Segment Anything Model (SAM) has emerged as a versatile tool for image segmentation without specific domain training, but it requires human prompts and may have limitations in specific domains. Traditional models like nnUNet perform automatic segmentation during inference and are effective in specific domains but need extensive domain-specific training. To combine the strengths of foundational and domain-specific models, we propose nnSAM, integrating SAM's robust feature extraction with nnUNet's automatic configuration to enhance segmentation accuracy on small datasets. Our nnSAM model optimizes two main approaches: leveraging SAM's feature extraction and nnUNet's domain-specific adaptation, and incorporating a boundary shape supervision loss function based on level set functions and curvature calculations to learn anatomical shape priors from limited data. We evaluated nnSAM on four segmentation tasks: brain white matter, liver, lung, and heart segmentation. Our method outperformed others, achieving the highest DICE score of 82.77% and the lowest ASD of 1.14 mm in brain white matter segmentation with 20 training samples, compared to nnUNet's DICE score of 79.25% and ASD of 1.36 mm. A sample size study highlighted nnSAM's advantage with fewer training samples. Our results demonstrate significant improvements in segmentation performance with nnSAM, showcasing its potential for small-sample learning in medical image segmentation.

5/16/2024

RobustSAM: Segment Anything Robustly on Degraded Images

Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang

Segment Anything Model (SAM) has emerged as a transformative approach in image segmentation, acclaimed for its robust zero-shot segmentation capabilities and flexible prompting system. Nonetheless, its performance is challenged by images with degraded quality. Addressing this limitation, we propose the Robust Segment Anything Model (RobustSAM), which enhances SAM's performance on low-quality images while preserving its promptability and zero-shot generalization. Our method leverages the pre-trained SAM model with only marginal parameter increments and computational requirements. The additional parameters of RobustSAM can be optimized within 30 hours on eight GPUs, demonstrating its feasibility and practicality for typical research laboratories. We also introduce the Robust-Seg dataset, a collection of 688K image-mask pairs with different degradations designed to train and evaluate our model optimally. Extensive experiments across various segmentation tasks and datasets confirm RobustSAM's superior performance, especially under zero-shot conditions, underscoring its potential for extensive real-world application. Additionally, our method has been shown to effectively improve the performance of SAM-based downstream tasks such as single image dehazing and deblurring.

6/17/2024