ASLseg: Adapting SAM in the Loop for Semi-supervised Liver Tumor Segmentation

Read original: arXiv:2312.07969 - Published 5/21/2024 by Shiyun Chen, Li Lin, Pujin Cheng, Xiaoying Tang

Overview

The paper introduces a semi-supervised liver tumor segmentation method called ASLseg (Adapting SAM in the Loop for Semi-supervised Liver Tumor Segmentation).
ASLseg adapts the Segment Anything Model (SAM) for semi-supervised liver tumor segmentation, leveraging a small set of annotated data and a larger set of unlabeled data.
The method refines the segmentation model iteratively through pseudo-labeling: predictions on unlabeled scans that are judged reliable are added to the labeled set, and the model is retrained on the expanded set.

Plain English Explanation

The researchers developed a new way to segment, or outline, liver tumors in medical images using a small number of labeled examples and a larger set of unlabeled images. Accurate tumor outlines matter because they support computer-aided diagnosis, surgical planning, and prognosis evaluation.

The key idea is to start with a pre-trained model called the Segment Anything Model (SAM), which is good at outlining objects in general images but, according to the paper, performs poorly on liver tumors out of the box. The researchers adapt this model to liver tumors by having it repeatedly predict on the unlabeled medical images and refine those predictions.

This pseudo-labeling approach lets the model gradually improve its tumor segmentation even without detailed labels for all the training data. The researchers show that the technique, called ASLseg, can achieve good performance with only a small amount of labeled data, which makes it practical for medical image analysis tasks where labeling is time-consuming and expensive.

The method builds on previous work like UltraSound-SAM-Adapter, which adapted SAM for breast lesion segmentation, and Shifting to Machine Supervision, which used self-supervised techniques for medical image segmentation. By combining these ideas, the researchers were able to create an effective system for semi-supervised liver tumor segmentation.

Technical Explanation

The key components of the ASLseg method are:

Overall Framework: ASLseg adapts the pre-trained SAM model to the liver tumor segmentation task. It consists of an initial training stage on a small set of labeled data, followed by an iterative pseudo-label refinement stage on a larger set of unlabeled data.
SAM Adaptation: The researchers fine-tune SAM on the labeled liver tumor data. Instead of relying on hand-drawn prompts, a segmentation model trained with a semi-supervised paradigm generates pseudo-labels, and these are passed to the fine-tuned SAM as prompts.
Pseudo-Label Refinement: In the iterative stage, an adaptation network refines the SAM predictions into higher-quality pseudo-labels. The pseudo-labels judged reliable are then added to the labeled set, and training repeats on the expanded set, so segmentation quality on the unlabeled data improves round by round.
Inference and Uncertainty Estimation: At test time, ASLseg uses the refined model to segment liver tumors in new images. It also provides an uncertainty map to help identify regions where the segmentation is less reliable.
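
The loop described by the components above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: `mask_to_box`, `self_training_round`, `toy_predict`, and `toy_confidence` are hypothetical names, a bounding box is only one of the prompt types SAM accepts (the summary does not say which one ASLseg uses), and the confidence rule and threshold are placeholders for whatever reliability criterion the paper applies. The retraining step between rounds is omitted.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]  # toy 2D image of per-pixel tumor probabilities (assumption)
Mask = List[List[int]]     # binary mask, 1 = tumor pixel


def mask_to_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (row_min, col_min, row_max, col_max) of the foreground.

    Shows one plausible way a pseudo-label could be turned into a box prompt.
    Returns None when the mask has no foreground pixels.
    """
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


def self_training_round(
    labeled: List[Tuple[Image, Mask]],
    unlabeled: List[Image],
    predict: Callable[[Image], Mask],
    confidence: Callable[[Image, Mask], float],
    threshold: float,
) -> Tuple[List[Tuple[Image, Mask]], List[Image]]:
    """One round: pseudo-label each unlabeled image and keep the reliable ones.

    `predict` stands in for the whole prediction chain (segmentation model,
    prompted SAM, adaptation network); retraining is not shown here.
    """
    expanded = list(labeled)
    remaining = []
    for image in unlabeled:
        pseudo = predict(image)
        if confidence(image, pseudo) >= threshold:
            expanded.append((image, pseudo))  # reliable: joins the labeled set
        else:
            remaining.append(image)           # unreliable: stays unlabeled
    return expanded, remaining


def toy_predict(image: Image) -> Mask:
    """Hypothetical predictor: threshold the per-pixel probability at 0.5."""
    return [[1 if p > 0.5 else 0 for p in row] for row in image]


def toy_confidence(image: Image, mask: Mask) -> float:
    """Hypothetical reliability score: mean distance of probabilities from 0.5."""
    pixels = [p for row in image for p in row]
    return sum(abs(p - 0.5) * 2 for p in pixels) / len(pixels)
```

In a real pipeline, the model would be retrained on the expanded set after each round before the remaining images are pseudo-labeled again.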

The researchers evaluated ASLseg on the LiTS liver tumor segmentation dataset, comparing its performance to fully-supervised baselines and to other semi-supervised approaches like Test-Time Adaptation with SALIp Cascade and Slide-SAM. The results show that ASLseg can achieve competitive performance using significantly less labeled data, which supports the effectiveness of the iterative pseudo-label refinement strategy.
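
Segmentation comparisons of this kind are commonly reported with the Dice coefficient, which measures the overlap between a predicted mask and the ground truth. The summary does not say which metrics the ASLseg paper reports, so the sketch below is only a general illustration; `dice_score` is a hypothetical helper, and returning 1.0 for two empty masks is a convention chosen here.

```python
from typing import List

Mask = List[List[int]]  # binary mask, 1 = tumor pixel


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice coefficient: 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    pred_flat = [v for row in pred for v in row]
    target_flat = [v for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    total = sum(1 for p in pred_flat if p) + sum(1 for t in target_flat if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * intersection / total
```

For example, a prediction covering two pixels, one of which matches a one-pixel ground truth, scores 2 * 1 / (2 + 1), or about 0.67.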

Critical Analysis

The paper provides a thorough evaluation of the ASLseg method and discusses several limitations and potential areas for future work:

The self-supervised refinement stage requires careful hyperparameter tuning to balance the contributions of the labeled and unlabeled data, which could be challenging in practice.
The method assumes that the unlabeled data is representative of the test distribution, which may not always be the case in real-world medical imaging scenarios.
The uncertainty estimation component could be further improved to provide more reliable guidance for human experts reviewing the segmentation results.
Extending the approach to handle multi-class segmentation or other types of medical imaging data would be an interesting direction for future research.
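
To make the first limitation concrete, semi-supervised methods often balance the two data sources by scaling the unlabeled loss with a weight that ramps up during training. The sketch below uses the sigmoid-shaped ramp-up popularized by consistency-based methods such as Mean Teacher. It is an illustration of the kind of hyperparameters involved (`rampup_steps`, `max_weight`), not a scheme the summary attributes to ASLseg; both function names are hypothetical.

```python
import math


def rampup_weight(step: int, rampup_steps: int, max_weight: float = 1.0) -> float:
    """Weight for the unlabeled loss: max_weight * exp(-5 * (1 - t)^2), t in [0, 1]."""
    if rampup_steps <= 0:
        return max_weight
    t = min(max(step / rampup_steps, 0.0), 1.0)  # training progress clipped to [0, 1]
    return max_weight * math.exp(-5.0 * (1.0 - t) ** 2)


def combined_loss(
    supervised: float,
    unsupervised: float,
    step: int,
    rampup_steps: int,
    max_weight: float = 1.0,
) -> float:
    """Total loss: labeled term plus the ramped-up unlabeled (pseudo-label) term."""
    return supervised + rampup_weight(step, rampup_steps, max_weight) * unsupervised
```

Early in training the pseudo-label term contributes almost nothing, which limits the damage from poor initial pseudo-labels. Choosing `rampup_steps` and `max_weight` is the kind of tuning the limitation refers to.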

Overall, the ASLseg method represents a promising step towards more efficient and practical semi-supervised segmentation for medical imaging applications. The open issues listed above, in particular hyperparameter sensitivity and a possible mismatch between the unlabeled data and the test distribution, remain to be addressed.

Conclusion

The ASLseg paper presents a novel semi-supervised approach for liver tumor segmentation that adapts a pre-trained Segment Anything Model (SAM) using a small set of labeled data and a larger set of unlabeled data. By iteratively generating, refining, and selecting pseudo-labels for the unlabeled data, the method can achieve competitive performance with significantly less annotation effort, a key requirement for practical deployment in medical imaging scenarios.

The work builds upon and combines ideas from previous research on adapting SAM for specific tasks and using self-supervision for medical image segmentation. The thorough evaluation and discussion of the method's limitations provide a solid foundation for future improvements and extensions to other medical imaging applications.

Overall, the ASLseg technique demonstrates the potential of semi-supervised learning to make medical image analysis more efficient and accessible, which could ultimately lead to better patient outcomes and more effective clinical decision-making.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

ASLseg: Adapting SAM in the Loop for Semi-supervised Liver Tumor Segmentation

Shiyun Chen, Li Lin, Pujin Cheng, Xiaoying Tang

Liver tumor segmentation is essential for computer-aided diagnosis, surgical planning, and prognosis evaluation. However, obtaining and maintaining a large-scale dataset with dense annotations is challenging. Semi-Supervised Learning (SSL) is a common technique to address these challenges. Recently, Segment Anything Model (SAM) has shown promising performance in some medical image segmentation tasks, but it performs poorly for liver tumor segmentation. In this paper, we propose a novel semi-supervised framework, named ASLseg, which can effectively adapt the SAM to the SSL setting and combine both domain-specific and general knowledge of liver tumors. Specifically, the segmentation model trained with a specific SSL paradigm provides the generated pseudo-labels as prompts to the fine-tuned SAM. An adaptation network is then used to refine the SAM-predictions and generate higher-quality pseudo-labels. Finally, the reliable pseudo-labels are selected to expand the labeled set for iterative training. Extensive experiments on the LiTS dataset demonstrate overwhelming performance of our ASLseg.

5/21/2024

Leveraging Task-Specific Knowledge from LLM for Semi-Supervised 3D Medical Image Segmentation

Suruchi Kumari, Aryan Das, Swalpa Kumar Roy, Indu Joshi, Pravendra Singh

Traditional supervised 3D medical image segmentation models need voxel-level annotations, which require huge human effort, time, and cost. Semi-supervised learning (SSL) addresses this limitation of supervised learning by facilitating learning with a limited annotated and larger amount of unannotated training samples. However, state-of-the-art SSL models still struggle to fully exploit the potential of learning from unannotated samples. To facilitate effective learning from unannotated data, we introduce LLM-SegNet, which exploits a large language model (LLM) to integrate task-specific knowledge into our co-training framework. This knowledge aids the model in comprehensively understanding the features of the region of interest (ROI), ultimately leading to more efficient segmentation. Additionally, to further reduce erroneous segmentation, we propose a Unified Segmentation loss function. This loss function reduces erroneous segmentation by not only prioritizing regions where the model is confident in predicting between foreground or background pixels but also effectively addressing areas where the model lacks high confidence in predictions. Experiments on publicly available Left Atrium, Pancreas-CT, and Brats-19 datasets demonstrate the superior performance of LLM-SegNet compared to the state-of-the-art. Furthermore, we conducted several ablation studies to demonstrate the effectiveness of various modules and loss functions leveraged by LLM-SegNet.

7/9/2024

Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation

Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu

Tumor lesion segmentation on CT or MRI images plays a critical role in cancer diagnosis and treatment planning. Considering the inherent differences in tumor lesion segmentation data across various medical imaging modalities and equipment, integrating medical knowledge into the Segment Anything Model (SAM) presents promising capability due to its versatility and generalization potential. Recent studies have attempted to enhance SAM with medical expertise by pre-training on large-scale medical segmentation datasets. However, challenges still exist in 3D tumor lesion segmentation owing to tumor complexity and the imbalance in foreground and background regions. Therefore, we introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation. We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks, facilitating the generation of more precise segmentation masks. Furthermore, an iterative refinement scheme is implemented in M-SAM to refine the segmentation masks progressively, leading to improved performance. Extensive experiments on seven tumor lesion segmentation datasets indicate that our M-SAM not only achieves high segmentation accuracy but also exhibits robust generalization. The code is available at https://github.com/nanase1025/M-SAM.

7/12/2024

Cross Prompting Consistency with Segment Anything Model for Semi-supervised Medical Image Segmentation

Juzheng Miao, Cheng Chen, Keli Zhang, Jie Chuai, Quanzheng Li, Pheng-Ann Heng

Semi-supervised learning (SSL) has achieved notable progress in medical image segmentation. To achieve effective SSL, a model needs to be able to efficiently learn from limited labeled data and effectively exploit knowledge from abundant unlabeled data. Recent developments in visual foundation models, such as the Segment Anything Model (SAM), have demonstrated remarkable adaptability with improved sample efficiency. To harness the power of foundation models for application in SSL, we propose a cross prompting consistency method with segment anything model (CPC-SAM) for semi-supervised medical image segmentation. Our method employs SAM's unique prompt design and innovates a cross-prompting strategy within a dual-branch framework to automatically generate prompts and supervisions across two decoder branches, enabling effective learning from both scarce labeled and valuable unlabeled data. We further design a novel prompt consistency regularization to reduce the prompt position sensitivity and to enhance the output invariance under different prompts. We validate our method on two medical image segmentation tasks. The extensive experiments with different labeled-data ratios and modalities demonstrate the superiority of our proposed method over the state-of-the-art SSL methods, with more than 9% Dice improvement on the breast cancer segmentation task.

7/9/2024