CBCTLiTS: A Synthetic, Paired CBCT/CT Dataset For Segmentation And Style Transfer

Read original: arXiv:2407.14853 - Published 7/23/2024 by Maximilian E. Tschuchnig, Philipp Steininger, Michael Gadermayr

Overview

The paper presents CBCTLiTS, a synthetic dataset of paired cone-beam computed tomography (CBCT) and computed tomography (CT) images for medical image segmentation and style transfer.
The dataset aims to address the lack of large-scale, high-quality, and diverse CBCT/CT image pairs for developing and evaluating AI models in the medical imaging domain.
The authors describe the dataset creation process, including the generation of realistic synthetic CBCT images from real CT scans, and provide a baseline performance evaluation for common segmentation tasks.

Plain English Explanation

CBCTLiTS: A Synthetic, Paired CBCT/CT Dataset For Segmentation And Style Transfer is a research paper that introduces a new dataset of medical images. The dataset contains pairs of two types of 3D medical scans: cone-beam computed tomography (CBCT) and computed tomography (CT).

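CBCT and CT are both X-ray based 3D imaging techniques used in healthcare, but they produce images of different quality. CBCT systems can be mobile and deliver images in near real time, which makes them useful during computer-assisted interventions (they are also common in dental and maxillofacial imaging), while conventional CT generally delivers higher image quality. The challenge is that CBCT images often suffer from artifacts, so CBCT and CT scans can look quite different even when they capture the same anatomy.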

To address this, the researchers created a synthetic dataset that pairs realistic CBCT images with their corresponding CT scans. This allows AI models to be trained on CBCT-CT image pairs, which can improve their performance on tasks like medical image segmentation (identifying different anatomical structures in the scans) and style transfer (translating between CBCT and CT image appearances).

The key benefits of this dataset are:

It provides a large, high-quality collection of CBCT-CT pairs, which are otherwise difficult to obtain.
It enables the development and testing of AI models that can better handle the differences between CBCT and CT images.
It can help advance research in medical image analysis and cross-modality translation.

Overall, this dataset and the associated research could contribute to improving the accuracy and capabilities of AI systems used in healthcare, particularly for applications involving CBCT imaging.

Technical Explanation

The CBCTLiTS dataset was created by the authors to address the lack of large-scale, diverse, and high-quality paired CBCT and CT image data for training and evaluating AI models in medical imaging tasks. CBCT and CT scans provide complementary information, but their visual differences pose challenges for developing robust machine learning algorithms.

To generate the dataset, the authors started from real, labelled CT scans and synthetically generated a CBCT counterpart for each one, so every CBCT volume is paired and aligned with a high-quality CT volume and its segmentation labels. The CBCT data is provided at 5 quality levels, ranging from a large number of projections (high visual quality, mild artifacts) to a small number of projections (severe artifacts), which allows image quality to be studied as a degree of freedom.
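According to the paper's abstract, the CBCT quality levels differ in how many projections are used, from many projections with mild artifacts to few projections with severe artifacts. The sketch below only illustrates why fewer projections mean coarser angular sampling. The projection counts, the full 360 degree arc, and the function names are hypothetical placeholders, not values taken from the paper.

```python
# Hypothetical projection counts for five quality levels
# (illustrative placeholders, not the values used for CBCTLiTS).
QUALITY_LEVELS = {1: 512, 2: 256, 3: 128, 4: 64, 5: 32}


def projection_angles(num_projections, arc_degrees=360.0):
    """Return evenly spaced gantry angles in degrees covering [0, arc_degrees)."""
    if num_projections <= 0:
        raise ValueError("num_projections must be positive")
    step = arc_degrees / num_projections
    return [i * step for i in range(num_projections)]


def angular_spacing(num_projections, arc_degrees=360.0):
    """Angle in degrees between two neighbouring projections."""
    if num_projections <= 0:
        raise ValueError("num_projections must be positive")
    return arc_degrees / num_projections


for level, count in QUALITY_LEVELS.items():
    spacing = angular_spacing(count)
    print(f"level {level}: {count} projections, {spacing:.3f} degrees apart")
```

Halving the projection count doubles the gap between neighbouring views, which is the usual source of streak artifacts in sparse-view reconstructions.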

The authors provide baselines for several research scenarios: unimodal and multimodal segmentation, multitask learning, and style transfer followed by segmentation. The segmentation targets range from the relatively simple liver to the more complex liver tumors. Because every scenario can be run at each of the 5 CBCT quality levels, the baselines can also be used to study how segmentation accuracy responds to increasingly severe artifacts.
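Segmentation baselines of this kind are commonly scored with the Dice coefficient, which measures the overlap between a predicted mask and the reference mask. The summary does not name the metric, so treating Dice as the evaluation metric is an assumption, as are the label ids used here (0 = background, 1 = liver, 2 = tumor). A minimal sketch on flattened toy label maps:

```python
def dice_score(pred, target, label=1):
    """Dice coefficient for one label over two equally sized, flattened label maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    pred_hits = sum(1 for p in pred if p == label)
    target_hits = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_hits + target_hits == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (pred_hits + target_hits)


# Toy flattened label maps (assumed ids: 0 = background, 1 = liver, 2 = tumor).
target = [0, 1, 1, 1, 2, 2, 0, 0]
pred = [0, 1, 1, 1, 2, 1, 0, 0]

liver_dice = dice_score(pred, target, label=1)
tumor_dice = dice_score(pred, target, label=2)
print(f"liver Dice: {liver_dice:.3f}, tumor Dice: {tumor_dice:.3f}")
```

In this toy case one tumor voxel is mislabelled as liver, which lowers the tumor score more than the liver score because the tumor region is smaller.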

Additionally, the authors explore cross-modality style transfer, where the goal is to translate CBCT images so that they look more like their corresponding CT counterparts. In the baselines, style transfer is followed by segmentation, so the translated volume is then passed to a segmentation model.
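The paper's abstract lists style transfer followed by segmentation as one baseline scenario. The sketch below shows only the two-stage composition. Both stages are hypothetical stand-ins (a clamp and a threshold) where a real setup would plug in trained networks, and all names are illustrative.

```python
from typing import Callable, List

Volume = List[float]  # flattened intensities, a stand-in for a 3D array
Mask = List[int]      # flattened label map


def make_pipeline(
    style_transfer: Callable[[Volume], Volume],
    segment: Callable[[Volume], Mask],
) -> Callable[[Volume], Mask]:
    """Compose the two stages: CBCT -> CT-like volume -> label mask."""

    def run(cbct: Volume) -> Mask:
        ct_like = style_transfer(cbct)
        return segment(ct_like)

    return run


def toy_style_transfer(cbct: Volume) -> Volume:
    """Clamp intensities to [0, 1]; placeholder for a learned CBCT-to-CT mapping."""
    return [min(1.0, max(0.0, v)) for v in cbct]


def toy_segment(volume: Volume) -> Mask:
    """Threshold at 0.5; placeholder for a trained segmentation model."""
    return [1 if v > 0.5 else 0 for v in volume]


pipeline = make_pipeline(toy_style_transfer, toy_segment)
mask = pipeline([-0.2, 0.4, 0.7, 1.3])
print(mask)
```

Keeping the stages as separate callables makes it easy to compare this pipeline against segmenting the CBCT volume directly, by swapping in an identity function for the first stage.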

Overall, the CBCTLiTS dataset and the associated baseline experiments showcase the potential of synthetic data to enable and accelerate the development of advanced AI models for medical imaging applications involving CBCT and CT modalities.

Critical Analysis

The CBCTLiTS dataset and the accompanying research provide a valuable contribution to the medical imaging community. The authors have recognized a significant challenge in the field – the lack of large-scale, diverse, and paired CBCT-CT data – and have taken steps to address it through the creation of a synthetic dataset.

One of the key strengths of this work is its controlled approach to dataset generation. Because the CBCT volumes are synthesized from real CT scans, they stay aligned with the source CT and its labels, and the 5 quality levels introduce CBCT artifacts in graded severity. This makes the dataset a useful resource for researchers and developers working on medical image analysis tasks.

The baseline experiments on segmentation and style transfer tasks demonstrate the utility of the CBCTLiTS dataset and give later work a reference point across unimodal and multimodal setups. How well models trained on the synthetic CBCT data transfer to real CBCT scans remains an open question, so the results are best read as evidence that synthetic paired data is a practical testbed when real paired data is scarce.

However, the dataset has limitations that should be kept in mind. The synthetic CBCT images may not fully capture the variability and complexity of real-world CBCT scans, which are influenced by factors such as patient anatomy, imaging equipment, and acquisition protocols. The dataset also covers a single anatomical target, liver and liver tumors, so its usefulness for a wider range of medical imaging tasks still needs validation.

As researchers continue to explore the potential of synthetic data in the medical imaging domain, it will be crucial to carefully evaluate the limitations and potential biases inherent in these synthetic datasets. Ongoing efforts to validate the performance and generalizability of AI models trained on synthetic data, as well as comparisons to models trained on real-world data, will be essential for ensuring the reliable and trustworthy deployment of these technologies in clinical settings.

Conclusion

The CBCTLiTS dataset presented in this paper represents an important step forward in addressing the data scarcity challenge in the medical imaging domain. By synthetically generating CBCT volumes at several quality levels and pairing them with their corresponding CT scans and labels, the authors have created a valuable resource for researchers and developers working on CBCT segmentation and style transfer.

The baseline experiments on segmentation and style transfer tasks demonstrate the potential of this synthetic dataset to enable the development of robust and accurate AI models, even in the absence of large-scale real-world CBCT-CT data. This could have significant implications for advancing medical imaging applications, particularly those involving CBCT modalities, which are increasingly being used in various healthcare settings.

As the field of medical AI continues to evolve, the availability of high-quality, diverse, and well-curated datasets will be crucial for driving innovation and ensuring the reliability and safety of these technologies. The CBCTLiTS dataset, and similar efforts to create synthetic medical data, represent an important step in this direction, paving the way for more accurate, efficient, and accessible medical imaging solutions.

Related Papers

CBCTLiTS: A Synthetic, Paired CBCT/CT Dataset For Segmentation And Style Transfer

Maximilian E. Tschuchnig, Philipp Steininger, Michael Gadermayr

Medical imaging is vital in computer assisted intervention. Particularly cone beam computed tomography (CBCT) with de facto real-time and mobility capabilities plays an important role. However, CBCT images often suffer from artifacts, which pose challenges for accurate interpretation, motivating research in advanced algorithms for more effective use in clinical practice. In this work we present CBCTLiTS, a synthetically generated, labelled CBCT dataset for segmentation with paired and aligned, high quality computed tomography data. The CBCT data is provided in 5 different levels of quality, ranging from a large number of projections with high visual quality and mild artifacts to a small number of projections with severe artifacts. This allows thorough investigations with the quality as a degree of freedom. We also provide baselines for several possible research scenarios like uni- and multimodal segmentation, multitask learning and style transfer followed by segmentation, from relatively simple liver segmentation to complex liver tumor segmentation. CBCTLiTS is accessible via https://www.kaggle.com/datasets/maximiliantschuchnig/cbct-liver-and-liver-tumor-segmentation-train-data.

7/23/2024

Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT

Maximilian E. Tschuchnig, Philipp Steininger, Michael Gadermayr

Cone-beam computed tomography (CBCT) is an important tool facilitating computer aided interventions, despite often suffering from artifacts that pose challenges for accurate interpretation. While the degraded image quality can affect downstream segmentation, the availability of high quality, preoperative scans represents potential for improvements. Here we consider a setting where preoperative CT and intraoperative CBCT scans are available, however, the alignment (registration) between the scans is imperfect. We propose a multimodal learning method that fuses roughly aligned CBCT and CT scans and investigate the effect of CBCT quality and misalignment on the final segmentation performance. For that purpose, we make use of a synthetically generated data set containing real CT and synthetic CBCT volumes. As an application scenario, we focus on liver and liver tumor segmentation. We show that the fusion of preoperative CT and simulated, intraoperative CBCT mostly improves segmentation performance (compared to using intraoperative CBCT only) and that even clearly misaligned preoperative data has the potential to improve segmentation performance.

7/2/2024

Improving Cone-Beam CT Image Quality with Knowledge Distillation-Enhanced Diffusion Model in Imbalanced Data Settings

Joonil Hwang, Sangjoon Park, NaHyeon Park, Seungryong Cho, Jin Sung Kim

In radiation therapy (RT), the reliance on pre-treatment computed tomography (CT) images encounters challenges due to anatomical changes, necessitating adaptive planning. Daily cone-beam CT (CBCT) imaging, pivotal for therapy adjustment, falls short in tissue density accuracy. To address this, our innovative approach integrates diffusion models for CT image generation, offering precise control over data synthesis. Leveraging a self-training method with knowledge distillation, we maximize CBCT data during therapy, complemented by sparse paired fan-beam CTs. This strategy, incorporated into state-of-the-art diffusion-based models, surpasses conventional methods like Pix2pix and CycleGAN. A meticulously curated dataset of 2800 paired CBCT and CT scans, supplemented by 4200 CBCT scans, undergoes preprocessing and teacher model training, including the Brownian Bridge Diffusion Model (BBDM). Pseudo-label CT images are generated, resulting in a dataset combining 5600 CT images with corresponding CBCT images. Thorough evaluation using MSE, SSIM, PSNR and LPIPS demonstrates superior performance against Pix2pix and CycleGAN. Our approach shows promise in generating high-quality CT images from CBCT scans in RT.

9/20/2024

A Multi-Stage Framework for 3D Individual Tooth Segmentation in Dental CBCT

Chunshi Wang, Bin Zhao, Shuxue Ding

Cone beam computed tomography (CBCT) is a common way of diagnosing dental related diseases. Accurate segmentation of 3D teeth is important for treatment. Although deep learning based methods have achieved convincing results in medical image processing, they need a large amount of annotated data for network training, making data collection and annotation very time-consuming. Besides, the domain shift that widely exists between data acquired by different devices severely impacts model generalization. To resolve the problem, we propose a multi-stage framework for 3D tooth segmentation in dental CBCT, which achieved third place in the Semi-supervised Teeth Segmentation 3D (STS-3D) challenge. The experiments on the validation set, compared with other semi-supervised segmentation methods, further indicate the validity of our approach.

7/16/2024