SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images

Read original: arXiv:2407.04651 - Published 7/8/2024 by Weiyi Xie, Nathalie Willems, Shubham Patil, Yang Li, Mayank Kumar

Overview

The paper proposes a few-shot fine-tuning strategy for adapting the Segment Anything Model (SAM) to anatomical segmentation in medical images.
It aims to bring SAM's segmentation ability to medical tasks where only a small set of labeled images is available.
The key idea is to reformulate SAM's mask decoder so that embeddings derived from a few annotated images act as prompts, and to train only that decoder while the image encoder stays frozen.

Plain English Explanation

The researchers wanted to use a powerful AI model called the Segment Anything Model (SAM) for a medical imaging task: segmenting anatomical structures in medical scans. There are two obstacles. Labeled medical datasets are often small, which makes it hard to train AI models effectively. SAM also normally expects a user to supply prompts such as points or bounding boxes, which is time-consuming when a volumetric scan has to be marked slice by slice.

The researchers' approach, called "SAM Fewshot Finetuning", starts from the pre-trained SAM and fine-tunes it with a small number of annotated medical images. A user manually segments a few 2D slices ahead of time, and the embeddings of those annotated regions then serve as the prompts when new images are segmented. This lets the model adapt to the specific characteristics of medical images while still drawing on the general capabilities it learned from a large and diverse dataset.

With this setup, the researchers report segmentation quality on par with fully supervised methods while using at least an order of magnitude less labeled data. This could be very useful for medical applications, where large, high-quality datasets are hard to obtain.

Technical Explanation

The key technical elements of the SAM Fewshot Finetuning approach are:

Leveraging the Segment Anything Model (SAM): SAM is a large, pre-trained computer vision model that segments a wide variety of objects in general images when given prompts such as points or bounding boxes. The researchers adapt it to medical image segmentation.
Few-shot Fine-tuning: Instead of training a model from scratch on a small medical dataset, the researchers reformulate SAM's mask decoder so that it is prompted by few-shot embeddings derived from a limited set of labeled images. Only the mask decoder is trained; the image encoder stays frozen, and caching keeps the fine-tuning process efficient. This lets the model specialize to the target domain without a large training set.
Anatomical Segmentation: The researchers evaluated their approach on segmenting anatomical structures (e.g. organs, bones) in medical images such as CT and MRI scans. The evaluation covers six anatomical segmentation tasks drawn from four datasets and two imaging modalities. This is an important task for clinical applications like diagnosis and surgical planning.
Experimental Evaluation: The researchers compared their approach with SAM under different prompting options and with the fully supervised nnU-Net. They report roughly a 50% improvement in IoU over SAM with point prompts only, and performance on par with fully supervised methods while needing at least an order of magnitude less labeled data.

Critical Analysis

The paper presents a novel and promising approach for leveraging large, general-purpose computer vision models like SAM to tackle medical image analysis tasks. The key strength of this work is its ability to achieve strong segmentation performance using only a small number of annotated medical images, which can be valuable in many real-world medical applications.

However, the paper does not address some potential limitations and caveats of the approach:

The performance of the fine-tuned SAM model may still be sensitive to the specific characteristics of the medical dataset used for fine-tuning. More research is needed to understand the generalization capabilities of this approach across diverse medical imaging modalities and anatomical structures.
The paper does not provide a detailed analysis of the computational and memory requirements of the fine-tuned SAM model, which could be an important practical consideration for deployment in clinical settings.
The paper focuses on segmentation tasks, but it would be interesting to see how the SAM Fewshot Finetuning approach could be extended to other medical image analysis tasks, such as disease classification or anomaly detection.

Overall, this work represents an important step forward in leveraging large-scale computer vision models for medical image analysis, and the researchers have identified a promising direction for further exploration and development.

Conclusion

The SAM Fewshot Finetuning approach presented in this paper demonstrates the potential of adapting powerful general-purpose computer vision models to the medical imaging domain, even when training data is limited. By training SAM's mask decoder with a small number of annotated medical images, the researchers report anatomical segmentation performance on par with fully supervised methods and clearly better than SAM prompted with points alone.

This work highlights the value of transfer learning and domain adaptation techniques in medical image analysis, where acquiring large, high-quality datasets can be challenging. If further developed and deployed, the SAM Fewshot Finetuning approach could help improve the accessibility and accuracy of medical image analysis tools, with potential benefits for clinical diagnosis, treatment planning, and patient outcomes.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images

Weiyi Xie, Nathalie Willems, Shubham Patil, Yang Li, Mayank Kumar

We propose a straightforward yet highly effective few-shot fine-tuning strategy for adapting the Segment Anything (SAM) to anatomical segmentation tasks in medical images. Our novel approach revolves around reformulating the mask decoder within SAM, leveraging few-shot embeddings derived from a limited set of labeled images (few-shot collection) as prompts for querying anatomical objects captured in image embeddings. This innovative reformulation greatly reduces the need for time-consuming online user interactions for labeling volumetric images, such as exhaustively marking points and bounding boxes to provide prompts slice by slice. With our method, users can manually segment a few 2D slices offline, and the embeddings of these annotated image regions serve as effective prompts for online segmentation tasks. Our method prioritizes the efficiency of the fine-tuning process by exclusively training the mask decoder through caching mechanisms while keeping the image encoder frozen. Importantly, this approach is not limited to volumetric medical images, but can generically be applied to any 2D/3D segmentation task. To thoroughly evaluate our method, we conducted extensive validation on four datasets, covering six anatomical segmentation tasks across two modalities. Furthermore, we conducted a comparative analysis of different prompting options within SAM and the fully-supervised nnU-Net. The results demonstrate the superior performance of our method compared to SAM employing only point prompts (approximately 50% improvement in IoU) and performs on-par with fully supervised methods whilst reducing the requirement of labeled data by at least an order of magnitude.

7/8/2024
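
The abstract above describes embeddings of annotated image regions acting as prompts for querying new images, and reports results in IoU. As a rough illustration of that idea only, the sketch below pools a prompt vector from a labeled support slice, scores the pixels of a query slice against it, and measures IoU. The pooling rule (masked averaging), the cosine-similarity threshold, and all names and feature values are assumptions made for this example; the paper's reformulated mask decoder is not reproduced here.

```python
import math

def masked_average(features, mask):
    """Average the feature vectors of the pixels where mask == 1."""
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(f[i] for f in selected) / len(selected) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def segment_with_prompt(query_features, prompt, threshold=0.9):
    """Label a pixel 1 when its embedding is close to the prompt vector."""
    return [1 if cosine(f, prompt) >= threshold else 0 for f in query_features]

def iou(pred, target):
    """Intersection over union of two binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, target) if p == 1 or t == 1)
    return inter / union if union else 1.0

# Made-up per-pixel embeddings for one annotated support slice.
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]
prompt = masked_average(support_features, support_mask)

# Made-up embeddings and ground truth for an unlabeled query slice.
query_features = [[0.8, 0.2], [0.2, 0.8], [1.0, 0.1], [0.0, 1.0], [0.6, 0.6]]
query_truth = [1, 0, 1, 0, 1]

pred = segment_with_prompt(query_features, prompt)
score = iou(pred, query_truth)
```

With these toy values the prediction is `[1, 0, 1, 0, 0]`: the last pixel's embedding sits between the two classes and falls below the threshold, so the intersection is 2 pixels, the union is 3, and the IoU is 2/3. No clicks or boxes are needed on the query slice, which is the labeling saving the abstract emphasizes.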

S-SAM: SVD-based Fine-Tuning of Segment Anything Model for Medical Image Segmentation

Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

Medical image segmentation has been traditionally approached by training or fine-tuning the entire model to cater to any new modality or dataset. However, this approach often requires tuning a large number of parameters during training. With the introduction of the Segment Anything Model (SAM) for prompted segmentation of natural images, many efforts have been made towards adapting it efficiently for medical imaging, thus reducing the training time and resources. However, these methods still require expert annotations for every image in the form of point prompts or bounding box prompts during training and inference, making it tedious to employ them in practice. In this paper, we propose an adaptation technique, called S-SAM, that only trains parameters equal to 0.4% of SAM's parameters and at the same time uses simply the label names as prompts for producing precise masks. This not only makes tuning SAM more efficient than the existing adaptation methods but also removes the burden of providing expert prompts. We call this modified version S-SAM and evaluate it on five different modalities including endoscopic images, x-ray, ultrasound, CT, and histology images. Our experiments show that S-SAM outperforms state-of-the-art methods as well as existing SAM adaptation methods while tuning a significantly less number of parameters. We release the code for S-SAM at https://github.com/JayParanjape/SVDSAM.

8/14/2024

FS-MedSAM2: Exploring the Potential of SAM2 for Few-Shot Medical Image Segmentation without Fine-tuning

Yunhao Bai, Qinji Yu, Boxiang Yun, Dakai Jin, Yingda Xia, Yan Wang

The Segment Anything Model 2 (SAM2) has recently demonstrated exceptional performance in zero-shot prompt segmentation for natural images and videos. However, it faces significant challenges when applied to medical images. Since its release, many attempts have been made to adapt SAM2's segmentation capabilities to the medical imaging domain. These efforts typically involve using a substantial amount of labeled data to fine-tune the model's weights. In this paper, we explore SAM2 from a different perspective via making the full use of its trained memory attention module and its ability of processing mask prompts. We introduce FS-MedSAM2, a simple yet effective framework that enables SAM2 to achieve superior medical image segmentation in a few-shot setting, without the need for fine-tuning. Our framework outperforms the current state-of-the-arts on two publicly available medical image datasets. The code is available at https://github.com/DeepMed-Lab-ECNU/FS_MedSAM2.

9/9/2024

ProtoSAM - One Shot Medical Image Segmentation With Foundational Models

Lev Ayzenberg, Raja Giryes, Hayit Greenspan

This work introduces a new framework, ProtoSAM, for one-shot medical image segmentation. It combines the use of prototypical networks, known for few-shot segmentation, with SAM - a natural image foundation model. The method proposed creates an initial coarse segmentation mask using the ALPnet prototypical network, augmented with a DINOv2 encoder. Following the extraction of an initial mask, prompts are extracted, such as points and bounding boxes, which are then input into the Segment Anything Model (SAM). State-of-the-art results are shown on several medical image datasets and demonstrate automated segmentation capabilities using a single image example (one shot) with no need for fine-tuning of the foundation model. Our code is available at: https://github.com/levayz/ProtoSAM

7/19/2024