ToNNO: Tomographic Reconstruction of a Neural Network's Output for Weakly Supervised Segmentation of 3D Medical Images

Read original: arXiv:2404.13103 - Published 4/23/2024 by Marius Schmidt-Mengin, Alexis Benichoux, Shibeshih Belachew, Nikos Komodakis, Nikos Paragios

Overview

This research paper presents a novel method called ToNNO (Tomographic Reconstruction of a Neural Network's Output) for weakly supervised segmentation of 3D medical images.
The key idea is to extract stacks of 2D slices from the 3D volume at different angles, feed them to a 2D encoder, and apply the inverse Radon transform to the encoder's outputs to reconstruct a 3D heatmap of its predictions.
Training needs only image-level labels, which avoids the time-consuming and costly voxel-level annotation of 3D medical images.

Plain English Explanation

The paper introduces a technique called ToNNO that can help automate the process of segmenting 3D medical images, such as CT or MRI scans. Segmentation is the task of identifying and delineating different anatomical structures or regions of interest within these 3D images.

Traditionally, training a model to segment 3D medical images requires a large dataset of fully labeled 3D volumes, which is difficult and expensive to obtain. ToNNO instead relies only on image-level labels that indicate whether a region of interest (such as a tumour or lesion) is present or absent.

A 2D neural network is trained on slices extracted from the 3D images so that it outputs high values for slices that contain the region of interest. No segmentation masks are used at any point; the image-level label is the only supervision.

The key insight of ToNNO is that the outputs of this 2D network, collected over stacks of slices taken at many different angles, can be turned back into a 3D heatmap of the full volume. This is done through tomographic reconstruction: the inverse Radon transform, the same operation used to reconstruct CT images from projections, is applied to the network's per-slice outputs.

By leveraging the 2D network's outputs in this way, ToNNO can perform 3D segmentation with much less manual annotation effort compared to traditional fully supervised methods.

Technical Explanation

The ToNNO method consists of two main steps:

Training a 2D encoder on slices extracted from the 3D medical images, using only image-level labels, so that it outputs high values for slices containing the region of interest.
Extracting stacks of slices at different angles from the 3D volume, applying the trained 2D encoder to them, and using the inverse Radon transform to reconstruct a 3D heatmap of the encoder's predictions.
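For reference, the classical 2D Radon transform and its inversion by filtered back-projection can be written as below. This is the textbook 2D formulation, not necessarily the exact 3D setup used in the paper, and the symbols f, θ, s, t, h, and ω are notation introduced here. Loosely, the encoder's per-slice output plays the role of the projection, and the reconstructed heatmap plays the role of f.

```latex
% Radon transform: integral of f along the line at angle \theta and signed offset s
(\mathcal{R}f)(\theta, s) = \int_{-\infty}^{\infty}
    f\big(s\cos\theta - t\sin\theta,\; s\sin\theta + t\cos\theta\big)\, dt

% Filtered back-projection: ramp-filter each projection in s, then smear it back
% (Fourier convention: \hat{h}(\omega) = \int h(s)\, e^{-2\pi i \omega s}\, ds)
f(x, y) = \int_{0}^{\pi}
    \big( (\mathcal{R}f)(\theta, \cdot) * h \big)(x\cos\theta + y\sin\theta)\, d\theta,
\qquad \hat{h}(\omega) = |\omega|
```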

The key technical innovation is this tomographic reconstruction step, which allows the method to leverage the 2D network's outputs to segment the 3D image without requiring full 3D annotations.
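The reconstruction idea can be illustrated with a small 2D analogue, where a "slice" is a single column of a rotated image rather than a 2D plane of a 3D volume. This is a minimal sketch under stated assumptions, not the authors' implementation: `toy_encoder` is a hypothetical stand-in for the trained 2D encoder (it simply sums the slice), and the helper names, nearest-neighbour rotation, and ramp filter are choices made here for illustration.

```python
import numpy as np

def rotate_nearest(img, angle_rad):
    """Rotate a square 2D array about its centre (nearest-neighbour sampling)."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n]
    x0, y0 = xs - c, ys - c
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    # Inverse mapping: for each output pixel, find the source pixel to copy.
    xi = np.rint(cos_a * x0 + sin_a * y0 + c).astype(int)
    yi = np.rint(-sin_a * x0 + cos_a * y0 + c).astype(int)
    valid = (xi >= 0) & (xi < n) & (yi >= 0) & (yi < n)
    out = np.zeros_like(img, dtype=float)
    out[valid] = img[yi[valid], xi[valid]]
    return out

def toy_encoder(slice_1d):
    """Hypothetical stand-in for a trained encoder: one score per slice."""
    return float(slice_1d.sum())

def slice_scores(image, angles, encoder):
    """Score every column ("slice") of the image at every angle."""
    n = image.shape[0]
    scores = np.zeros((n, len(angles)))
    for k, angle in enumerate(angles):
        rotated = rotate_nearest(image, angle)
        for j in range(n):
            scores[j, k] = encoder(rotated[:, j])
    return scores

def ramp_filter(profile):
    """Apply the |frequency| ramp filter to one profile of scores via the FFT."""
    n = profile.shape[0]
    padded = max(64, 2 ** int(np.ceil(np.log2(2 * n))))
    freqs = np.fft.fftfreq(padded)
    spectrum = np.fft.fft(profile, n=padded)
    return np.real(np.fft.ifft(spectrum * 2.0 * np.abs(freqs)))[:n]

def reconstruct_heatmap(scores, angles):
    """Filtered back-projection of per-slice scores into a 2D heatmap."""
    n = scores.shape[0]
    heatmap = np.zeros((n, n))
    for k, angle in enumerate(angles):
        filtered = ramp_filter(scores[:, k])
        smeared = np.tile(filtered, (n, 1))  # constant along each column
        heatmap += rotate_nearest(smeared, -angle)  # undo the slicing rotation
    return heatmap * np.pi / (2 * len(angles))

# Toy "volume": a small bright region standing in for a lesion.
image = np.zeros((33, 33))
image[9:12, 21:24] = 1.0
angles = np.linspace(0.0, np.pi, 36, endpoint=False)

heatmap = reconstruct_heatmap(slice_scores(image, angles, toy_encoder), angles)
peak = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print("heatmap peak (row, col):", peak)
```

The encoder never sees pixel-level labels and only emits one number per slice, yet the heatmap peaks near the bright region because every slice passing through it scores high. With a real network the scores are not exact line or plane integrals, so the reconstruction is a localisation heatmap rather than an exact inversion.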

The authors evaluate ToNNO on four large-scale medical image datasets and report that it outperforms 2D class activation mapping (CAM) methods, the approach most existing weakly supervised techniques rely on. They also extend ToNNO by combining tomographic reconstruction with CAM, proposing Averaged CAM and Tomographic CAM, which obtain even better results.
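To compare a heatmap against a reference segmentation, it is typically binarised and scored with an overlap metric. The sketch below assumes a fixed threshold and the Dice coefficient; this summary does not state which metric or thresholding scheme the authors use, so both are assumptions, and the function names are hypothetical.

```python
import numpy as np

def heatmap_to_mask(heatmap, threshold):
    """Binarise a heatmap: voxels at or above the threshold count as foreground."""
    return np.asarray(heatmap) >= threshold

def dice_coefficient(pred_mask, true_mask):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / denom

# Tiny made-up example: two predicted foreground pixels, one of them correct.
example_heatmap = np.array([[0.1, 0.9], [0.8, 0.2]])
example_truth = np.array([[0, 1], [0, 0]])
example_dice = dice_coefficient(heatmap_to_mask(example_heatmap, 0.5), example_truth)
```

The same functions work unchanged on 3D arrays, since both operate elementwise and sum over all axes.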

Critical Analysis

The paper provides a thorough evaluation of ToNNO's performance on multiple 3D medical imaging tasks. However, some potential limitations or areas for further research include:

The method still requires an image-level label for every training volume, which can be time-consuming to collect at scale. Exploring ways to further reduce the annotation burden would be valuable.
The tomographic reconstruction step assumes that the 2D network's predictions are reliable and consistent across slices and angles. In practice, there may be artifacts or inconsistencies that could affect the quality of the 3D reconstruction.
Further analysis of how sensitive ToNNO is to factors such as the number of slice angles or the architecture of the 2D encoder could provide more insight.

Overall, ToNNO presents a promising approach to address the challenge of 3D medical image segmentation, and the paper makes a valuable contribution to the field of weakly supervised learning for medical imaging.

Conclusion

The ToNNO method proposed in this paper offers a new way to perform 3D medical image segmentation with significantly less manual annotation effort than traditional fully supervised approaches. By applying a 2D encoder to slices taken at different angles and combining its outputs through tomographic reconstruction, ToNNO outperforms 2D CAM methods on four large-scale medical image datasets while using only image-level labels.

This work demonstrates the potential of weakly supervised learning techniques to enhance medical image analysis and could have important implications for streamlining the development of clinical AI systems. As the medical imaging field continues to grapple with the challenge of obtaining comprehensive annotated datasets, methods like ToNNO may become increasingly valuable tools in the arsenal of researchers and clinicians.

Related Papers

ToNNO: Tomographic Reconstruction of a Neural Network's Output for Weakly Supervised Segmentation of 3D Medical Images

Marius Schmidt-Mengin, Alexis Benichoux, Shibeshih Belachew, Nikos Komodakis, Nikos Paragios

Annotating lots of 3D medical images for training segmentation models is time-consuming. The goal of weakly supervised semantic segmentation is to train segmentation models without using any ground truth segmentation masks. Our work addresses the case where only image-level categorical labels, indicating the presence or absence of a particular region of interest (such as tumours or lesions), are available. Most existing methods rely on class activation mapping (CAM). We propose a novel approach, ToNNO, which is based on the Tomographic reconstruction of a Neural Network's Output. Our technique extracts stacks of slices with different angles from the input 3D volume, feeds these slices to a 2D encoder, and applies the inverse Radon transform in order to reconstruct a 3D heatmap of the encoder's predictions. This generic method allows to perform dense prediction tasks on 3D volumes using any 2D image encoder. We apply it to weakly supervised medical image segmentation by training the 2D encoder to output high values for slices containing the regions of interest. We test it on four large scale medical image datasets and outperform 2D CAM methods. We then extend ToNNO by combining tomographic reconstruction with CAM methods, proposing Averaged CAM and Tomographic CAM, which obtain even better results.

4/23/2024

Generative Adversarial Networks for Weakly Supervised Generation and Evaluation of Brain Tumor Segmentations on MR Images

Jay J. Yoo, Khashayar Namdar, Matthias W. Wagner, Liana Nobre, Uri Tabori, Cynthia Hawkins, Birgit B. Ertl-Wagner, Farzad Khalvati

Segmentation of regions of interest (ROIs) for identifying abnormalities is a leading problem in medical imaging. Using machine learning for this problem generally requires manually annotated ground-truth segmentations, demanding extensive time and resources from radiologists. This work presents a weakly supervised approach that utilizes binary image-level labels, which are much simpler to acquire, to effectively segment anomalies in 2D magnetic resonance images without ground truth annotations. We train a generative adversarial network (GAN) that converts cancerous images to healthy variants, which are used along with localization seeds as priors to generate improved weakly supervised segmentations. The non-cancerous variants can also be used to evaluate the segmentations in a weakly supervised fashion, which allows for the most effective segmentations to be identified and then applied to downstream clinical classification tasks. On the Multimodal Brain Tumor Segmentation (BraTS) 2020 dataset, our proposed method generates and identifies segmentations that achieve test Dice coefficients of 83.91%. Using these segmentations for pathology classification results with a test AUC of 93.32% which is comparable to the test AUC of 95.80% achieved when using true segmentations.

8/19/2024

Semi-Supervised Segmentation via Embedding Matching

Weiyi Xie, Nathalie Willems, Nikolas Lessmann, Tom Gibbons, Daniele De Massari

Deep convolutional neural networks are widely used in medical image segmentation but require many labeled images for training. Annotating three-dimensional medical images is a time-consuming and costly process. To overcome this limitation, we propose a novel semi-supervised segmentation method that leverages mostly unlabeled images and a small set of labeled images in training. Our approach involves assessing prediction uncertainty to identify reliable predictions on unlabeled voxels from the teacher model. These voxels serve as pseudo-labels for training the student model. In voxels where the teacher model produces unreliable predictions, pseudo-labeling is carried out based on voxel-wise embedding correspondence using reference voxels from labeled images. We applied this method to automate hip bone segmentation in CT images, achieving notable results with just 4 CT scans. The proposed approach yielded a Hausdorff distance with 95th percentile (HD95) of 3.30 and IoU of 0.929, surpassing existing methods achieving HD95 (4.07) and IoU (0.927) at their best.

7/8/2024

Weakly Supervised Learning of Cortical Surface Reconstruction from Segmentations

Qiang Ma, Liu Li, Emma C. Robinson, Bernhard Kainz, Daniel Rueckert

Existing learning-based cortical surface reconstruction approaches heavily rely on the supervision of pseudo ground truth (pGT) cortical surfaces for training. Such pGT surfaces are generated by traditional neuroimage processing pipelines, which are time consuming and difficult to generalize well to low-resolution brain MRI, e.g., from fetuses and neonates. In this work, we present CoSeg, a learning-based cortical surface reconstruction framework weakly supervised by brain segmentations without the need for pGT surfaces. CoSeg introduces temporal attention networks to learn time-varying velocity fields from brain MRI for diffeomorphic surface deformations, which fit an initial surface to target cortical surfaces within only 0.11 seconds for each brain hemisphere. A weakly supervised loss is designed to reconstruct pial surfaces by inflating the white surface along the normal direction towards the boundary of the cortical gray matter segmentation. This alleviates partial volume effects and encourages the pial surface to deform into deep and challenging cortical sulci. We evaluate CoSeg on 1,113 adult brain MRI at 1mm and 2mm resolution. CoSeg achieves superior geometric and morphological accuracy compared to existing learning-based approaches. We also verify that CoSeg can extract high-quality cortical surfaces from fetal brain MRI on which traditional pipelines fail to produce acceptable results.

6/19/2024