Diff2CT: Diffusion Learning to Reconstruct Spine CT from Biplanar X-Rays

Read original: arXiv:2408.09731 - Published 8/22/2024 by Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

Overview

This paper proposes a diffusion-based deep learning model called "Diff2CT" to reconstruct high-quality 3D CT images from biplanar X-ray inputs.
The model leverages the strong representation power of diffusion models to effectively learn the complex mapping between 2D X-rays and 3D CT volumes.
The authors demonstrate that Diff2CT outperforms existing methods for CT reconstruction from biplanar X-rays on various evaluation metrics.

Plain English Explanation

The paper introduces a new diffusion model called "Diff2CT" that can take 2D X-ray images as input and generate high-quality 3D CT images as output. This is a useful capability because obtaining CT scans can be expensive and time-consuming, whereas X-rays are more readily available and cheaper.

The key idea behind Diff2CT is to leverage the powerful representation learning capabilities of diffusion models to effectively map the 2D X-ray data to the corresponding 3D CT volume. Diffusion models work by gradually adding noise to an image and then learning to reverse this process, allowing them to capture complex relationships in the data.
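The noising half of that process has a simple closed form in the standard DDPM formulation, which is sketched below. This is a generic illustration and not the paper's code: the linear schedule, the number of steps, and the random 8x8x8 array standing in for a CT volume are all assumptions.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances; a linear schedule is a common DDPM default (assumed here)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return x_t, noise

betas = linear_beta_schedule(1000)
alpha_bars = np.cumprod(1.0 - betas)  # share of the original signal variance left at each step

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8, 8))      # hypothetical stand-in for a small CT volume
x_early, _ = q_sample(x0, 0, alpha_bars, rng)    # almost identical to x0
x_late, _ = q_sample(x0, 999, alpha_bars, rng)   # almost pure Gaussian noise
```

A denoising network is then trained to predict `noise` from `x_t` and `t`, which is what makes the process reversible.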

By applying this diffusion-based approach, the authors show that Diff2CT can outperform existing methods for reconstructing CT images from biplanar (two-view) X-rays. This could have important applications in medical imaging, where Diff2CT could help clinicians obtain 3D CT-like information from more readily available and cost-effective X-ray scans.

Technical Explanation

The Diff2CT model consists of an encoder network that takes biplanar X-ray images as input and generates a latent representation. This latent representation is then passed through a diffusion-based decoder network that progressively refines the 3D CT reconstruction.
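The "progressive refinement" can be pictured as the usual DDPM ancestral sampling loop, sketched here under stated assumptions. The function `predict_noise` is a hypothetical placeholder for the trained network, `cond` stands for whatever representation of the two X-ray views the model consumes, and the schedule and volume size are arbitrary; none of these names come from the paper.

```python
import numpy as np

def ddpm_sample(predict_noise, cond, shape, betas, rng):
    """Ancestral DDPM sampling: start from Gaussian noise and denoise step by step.

    predict_noise(x_t, t, cond) is a hypothetical stand-in for the trained network.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)                  # x_T: pure noise
    for t in range(len(betas) - 1, -1, -1):
        eps = predict_noise(x, t, cond)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)
        else:
            x = mean                                # no noise is added at the final step
    return x

# Toy check: an "oracle" denoiser that already knows the target volume.
betas = np.linspace(1e-4, 0.02, 50)
alpha_bars = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(4, 4, 4))

def oracle(x_t, t, cond):
    # Solve x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for eps, with x0 = target.
    return (x_t - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

recon = ddpm_sample(oracle, cond=None, shape=target.shape, betas=betas, rng=rng)
```

With a perfect noise predictor the loop lands on the target volume. A real model only approximates that prediction, and it is the conditioning on the X-rays that steers the samples toward the right anatomy.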

The key innovation of Diff2CT is the use of a diffusion model for the decoder, which allows the network to capture the complex relationship between the 2D X-ray inputs and the corresponding 3D CT volumes. During training, noise is gradually added to the CT volumes and the model learns to reverse this process, conditioned on the X-rays, so that at inference it can generate a CT volume starting from pure noise.
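The paper's abstract also mentions a projection loss meant to improve the structural integrity of the reconstructed images, but this summary does not give its form. One plausible reading is sketched below: project the predicted and reference volumes along two orthogonal axes and penalize the difference. The mean-intensity parallel projection, the choice of axes, and the squared-error penalty are all assumptions made for illustration, and real X-ray formation involves cone-beam geometry and attenuation that this ignores.

```python
import numpy as np

def project(volume, axis):
    """Crude parallel-beam projection: average intensities along one axis (an assumption)."""
    return volume.mean(axis=axis)

def projection_loss(pred_volume, ref_volume, axes=(1, 2)):
    """Mean squared error between the 2D projections of two volumes along two orthogonal axes."""
    per_view = [
        np.mean((project(pred_volume, a) - project(ref_volume, a)) ** 2)
        for a in axes
    ]
    return float(np.mean(per_view))

rng = np.random.default_rng(0)
ref = rng.uniform(0.0, 1.0, size=(8, 8, 8))   # hypothetical reference CT volume
loss_same = projection_loss(ref, ref)          # identical volumes give zero loss
loss_shift = projection_loss(ref + 0.5, ref)   # a constant offset c gives a loss of c squared
```

A term like this would typically be added to the usual denoising objective, so that the generated volume stays consistent with the two views it was conditioned on.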

The authors evaluate Diff2CT on several benchmark datasets for CT reconstruction from biplanar X-rays and show that it outperforms existing methods on metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). The abstract reports an SSIM of 0.83, a relative increase of 10%, and a Fréchet Inception Distance (FID) of 83.43, a relative decrease of 25%.
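For readers unfamiliar with PSNR and SSIM, a minimal sketch of both is given here. It assumes intensities scaled to [0, 1], and the SSIM is a simplified single-window version computed over the whole array. Standard implementations, such as `structural_similarity` in scikit-image, average the same formula over local sliding windows, so their values will differ from this one.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(reference, estimate, data_range=1.0):
    """SSIM formula applied once to the whole array (no sliding window; a simplification)."""
    c1 = (0.01 * data_range) ** 2   # stabilizing constants from the original SSIM definition
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = reference.mean(), estimate.mean()
    var_x, var_y = reference.var(), estimate.var()
    cov_xy = np.mean((reference - mu_x) * (estimate - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

rng = np.random.default_rng(0)
ct = rng.uniform(0.0, 1.0, size=(8, 8, 8))   # hypothetical reference volume
offset = ct + 0.1                             # estimate with a constant error of 0.1

psnr_offset = psnr(ct, offset)                # MSE is 0.01, so this is 20 dB
ssim_same = global_ssim(ct, ct)               # identical inputs score 1
```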

Critical Analysis

The paper provides a compelling demonstration of the potential of diffusion models for medical imaging tasks, such as reconstructing 3D CT volumes from 2D X-ray inputs. However, the authors acknowledge several limitations and areas for further research:

The current implementation of Diff2CT is trained and evaluated on a limited set of anatomical regions, such as the spine. Further research is needed to extend the approach to a broader range of body parts and medical conditions.
The training and inference time of Diff2CT may be a concern, as diffusion models can be computationally intensive. Optimizations or alternative architectures could be explored to address this.
The paper does not provide a detailed analysis of the model's robustness to variations in X-ray imaging conditions, such as different acquisition angles or noise levels. Evaluating the model's performance in more diverse real-world scenarios would be valuable.

Overall, the Diff2CT approach represents an interesting and promising direction for leveraging the power of diffusion models in medical imaging applications. Further research and development in this area could lead to significant advancements in areas like 3D CT reconstruction from 2D images, with potential benefits for patient care and clinical decision-making.

Conclusion

This paper introduces Diff2CT, a diffusion-based deep learning model for reconstructing high-quality 3D CT images from biplanar X-ray inputs. By harnessing the strong representation learning capabilities of diffusion models, the authors demonstrate that Diff2CT can outperform existing methods for this task, highlighting the potential of this approach for medical imaging applications.

While the current implementation has some limitations, the Diff2CT model represents an exciting advancement in the field of CT reconstruction from 2D X-rays. Continued research and development in this area could lead to more accessible and cost-effective 3D medical imaging solutions, ultimately benefiting patients and healthcare professionals.

Related Papers

Diff2CT: Diffusion Learning to Reconstruct Spine CT from Biplanar X-Rays

Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

Intraoperative CT imaging serves as a crucial resource for surgical guidance; however, it may not always be readily accessible or practical to implement. In scenarios where CT imaging is not an option, reconstructing CT scans from X-rays can offer a viable alternative. In this paper, we introduce an innovative method for 3D CT reconstruction utilizing biplanar X-rays. Distinct from previous research that relies on conventional image generation techniques, our approach leverages a conditional diffusion process to tackle the task of reconstruction. More precisely, we employ a diffusion-based probabilistic model trained to produce 3D CT images based on orthogonal biplanar X-rays. To improve the structural integrity of the reconstructed images, we incorporate a novel projection loss function. Experimental results validate that our proposed method surpasses existing state-of-the-art benchmarks in both visual image quality and multiple evaluative metrics. Specifically, our technique achieves a higher Structural Similarity Index (SSIM) of 0.83, a relative increase of 10%, and a lower Fréchet Inception Distance (FID) of 83.43, which represents a relative decrease of 25%.

8/22/2024

DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays

Xuhui Liu, Zhi Qiao, Runkun Liu, Hong Li, Juan Zhang, Xiantong Zhen, Zhen Qian, Baochang Zhang

Computed tomography (CT) is widely utilized in clinical settings because it delivers detailed 3D images of the human body. However, performing CT scans is not always feasible due to radiation exposure and limitations in certain surgical environments. As an alternative, reconstructing CT images from ultra-sparse X-rays offers a valuable solution and has gained significant interest in scientific research and medical applications. However, it presents great challenges as it is inherently an ill-posed problem, often compromised by artifacts resulting from overlapping structures in X-ray images. In this paper, we propose DiffuX2CT, which models CT reconstruction from orthogonal biplanar X-rays as a conditional diffusion process. DiffuX2CT is established with a 3D global coherence denoising model with a new, implicit conditioning mechanism. We realize the conditioning mechanism by a newly designed tri-plane decoupling generator and an implicit neural decoder. By doing so, DiffuX2CT achieves structure-controllable reconstruction, which enables 3D structural information to be recovered from 2D X-rays, therefore producing faithful textures in CT images. As an extra contribution, we collect a real-world lumbar CT dataset, called LumbarV, as a new benchmark to verify the clinical significance and performance of CT reconstruction from X-rays. Extensive experiments on this dataset and three more publicly available datasets demonstrate the effectiveness of our proposal.

7/19/2024

DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s)

Yun Su Jeong, Hye Bin Yoo, Il Yong Chun

Computed tomography (CT) provides high-resolution medical imaging, but it can expose patients to high radiation. X-ray scanners have low radiation exposure, but their resolutions are low. This paper proposes a new conditional diffusion model, DX2CT, that reconstructs three-dimensional (3D) CT volumes from bi- or mono-planar X-ray image(s). The proposed DX2CT consists of two key components: 1) modulating feature maps extracted from two-dimensional (2D) X-ray(s) with 3D positions of the CT volume using a new transformer and 2) effectively using the modulated 3D position-aware feature maps as conditions of DX2CT. In particular, the proposed transformer can provide conditions with rich information about a target CT slice to the conditional diffusion model, enabling high-quality CT reconstruction. Our experiments with the bi- or mono-planar X-ray(s) benchmark datasets show that the proposed DX2CT outperforms several state-of-the-art methods. Our codes and model will be available at: https://www.github.com/intyeger/DX2CT.

9/16/2024

DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays

Yiran Sun, Hana Baroudi, Tucker Netherton, Laurence Court, Osama Mawlawi, Ashok Veeraraghavan, Guha Balakrishnan

Computed Tomography (CT) scans are the standard-of-care for the visualization and diagnosis of many clinical ailments, and are needed for the treatment planning of external beam radiotherapy. Unfortunately, the availability of CT scanners in low- and mid-resource settings is highly variable. Planar x-ray radiography units, in comparison, are far more prevalent, but can only provide limited 2D observations of the 3D anatomy. In this work we propose DIFR3CT, a 3D latent diffusion model that can generate a distribution of plausible CT volumes from one or few (<10) planar x-ray observations. DIFR3CT works by fusing 2D features from each x-ray into a joint 3D space, and performing diffusion conditioned on these fused features in a low-dimensional latent space. We conduct extensive experiments demonstrating that DIFR3CT is better than recent sparse CT reconstruction baselines in terms of standard pixel-level metrics (PSNR, SSIM) on both the public LIDC and in-house post-mastectomy CT datasets. We also show that DIFR3CT supports uncertainty quantification via Monte Carlo sampling, which provides an opportunity to measure reconstruction reliability. Finally, we perform a preliminary pilot study evaluating DIFR3CT for automated breast radiotherapy contouring and planning, and demonstrate promising feasibility. Our code is available at https://github.com/yransun/DIFR3CT.

8/28/2024