DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays

Read original: arXiv:2407.13545 - Published 7/19/2024 by Xuhui Liu, Zhi Qiao, Runkun Liu, Hong Li, Juan Zhang, Xiantong Zhen, Zhen Qian, Baochang Zhang

Overview

The paper proposes a novel diffusion-based method called DiffuX2CT to reconstruct 3D CT images from biplanar X-ray images.
It introduces an implicit conditioning mechanism to effectively leverage the information from the X-ray images during the CT reconstruction process.
The model uses a tri-plane representation to capture the 3D structure of the anatomy, enabling high-quality CT reconstruction from only two X-ray views.

Plain English Explanation

This research paper introduces a new method called DiffuX2CT for reconstructing 3D computed tomography (CT) images from a pair of 2D X-ray images taken from different angles. The key innovation is an "implicit conditioning mechanism" that allows the system to effectively use the information from the X-ray images to guide the reconstruction of the 3D CT image.

Reconstructing a 3D CT image from 2D X-ray images is hard because the problem is ill-posed: a few flat projections leave much of the 3D structure undetermined, and overlapping structures in the X-rays tend to cause artifacts. DiffuX2CT tackles this with "diffusion models," a type of generative model that is trained by gradually adding "noise" to data and learning to remove it again step by step. By conditioning the denoising process on the input X-ray images, the system is able to reconstruct high-quality 3D CT images that are consistent with the patient's anatomy.

Another important aspect of the DiffuX2CT model is its use of a "tri-plane representation" to capture the 3D structure of the anatomy. This allows the system to reconstruct the 3D CT image from just two 2D X-ray views, rather than the many projections a conventional CT scan acquires. This could be particularly useful in medical settings where minimizing patient exposure to radiation is important, or where a full CT scan is impractical, such as certain surgical environments.

Technical Explanation

The key technical innovation in the DiffuX2CT model is an "implicit conditioning mechanism" that brings the information from the input X-ray images into the CT reconstruction process. According to the paper's abstract, this mechanism is realized by a newly designed tri-plane decoupling generator and an implicit neural decoder, which together condition a 3D global coherence denoising model on the X-ray images.

Specifically, the model takes in a pair of biplanar X-ray images and uses a neural network to extract a latent representation that encodes the 3D anatomical structure. This latent representation is then used to condition the diffusion process, guiding the gradual reconstruction of the 3D CT image.
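To make the idea of a conditioned diffusion process more concrete, here is a minimal sketch of the standard DDPM forward (noising) step and one reverse (denoising) step that receives a conditioning input. It is a generic illustration, not the authors' implementation: the linear schedule, the function names, the toy array shapes, and the `placeholder_predict_noise` stand-in for a trained network are all assumptions made for this example.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule; returns betas and the cumulative products of (1 - beta)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

def denoise_step(x_t, t, cond, predict_noise, betas, alpha_bars, rng):
    """One reverse step; `cond` carries whatever features were derived from the X-rays."""
    eps_hat = predict_noise(x_t, t, cond)
    alpha_t = 1.0 - betas[t]
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alpha_t)
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def placeholder_predict_noise(x_t, t, cond):
    """Hypothetical stand-in for a trained conditional 3D denoiser.

    It ignores `cond` and predicts zero noise, so it only shows the call shape.
    """
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
ct_volume = rng.random((8, 8, 8))        # toy stand-in for a CT volume
xray_features = rng.random((2, 8, 8))    # toy stand-in for biplanar X-ray features

x_t, eps = add_noise(ct_volume, 500, alpha_bars, rng)
x_prev = denoise_step(x_t, 500, xray_features, placeholder_predict_noise,
                      betas, alpha_bars, rng)
```

In a real system the placeholder would be a trained network, and sampling would start from pure noise and apply `denoise_step` for every timestep from the last down to 0, passing the same X-ray conditioning each time.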

To efficiently capture the 3D structure from just two 2D views, the DiffuX2CT model uses a "tri-plane representation," which represents the 3D volume as three orthogonal 2D planes. This allows the model to learn a more compact and efficient representation of the 3D anatomy compared to a full 3D voxel grid.
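The tri-plane idea can be sketched in a few lines. The example below stores three axis-aligned feature planes, looks up the feature for a 3D point by projecting it onto each plane, and compares the storage cost with a dense voxel grid. The plane names, the bilinear sampling, and the choice to sum the three sampled features are common conventions assumed here for illustration; the paper's tri-plane decoupling generator and implicit neural decoder may work differently.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly sample a (C, H, W) feature plane at continuous coords u, v in [0, 1]."""
    _, h, w = plane.shape
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * plane[:, y0, x0] + fx * plane[:, y0, x1]
    bottom = (1 - fx) * plane[:, y1, x0] + fx * plane[:, y1, x1]
    return (1 - fy) * top + fy * bottom

def query_triplane(planes, point):
    """Project a 3D point onto the xy, xz and yz planes and sum the sampled features."""
    x, y, z = point
    return (bilinear_sample(planes["xy"], x, y)
            + bilinear_sample(planes["xz"], x, z)
            + bilinear_sample(planes["yz"], y, z))

channels, res = 4, 32
rng = np.random.default_rng(0)
# Random features stand in for what a trained generator would produce.
planes = {name: rng.standard_normal((channels, res, res)) for name in ("xy", "xz", "yz")}

feature = query_triplane(planes, (0.25, 0.5, 0.75))  # one feature vector of length `channels`

triplane_values = 3 * channels * res ** 2  # 12288 stored numbers
voxel_values = channels * res ** 3         # 131072 stored numbers
```

At this toy resolution the three planes hold roughly a tenth of the values a dense grid would need, and the gap widens linearly with resolution. A decoder (for instance a small MLP, assumed here rather than taken from the paper) would then map each queried feature vector to an output value.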

The authors evaluate DiffuX2CT on LumbarV, a real-world lumbar CT dataset they collect as a new benchmark, and on three publicly available datasets, reporting that it reconstructs high-quality 3D CT images that are consistent with the input X-ray views. The model is reported to outperform several baseline methods, including multi-view X-ray image synthesis and X-ray to CTPA generation approaches.
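What "consistent with the input X-ray views" can mean, and why that consistency alone does not pin down the volume, can be shown with a toy example. The sketch below treats an X-ray as a plain sum of voxel values along one axis, which is a strong simplification assumed for illustration (real radiographs involve attenuation physics and perspective geometry). The function names and the tiny volumes are hypothetical.

```python
import numpy as np

def biplanar_projections(volume):
    """Sum a (D, H, W) volume along two orthogonal axes as crude frontal/lateral views."""
    frontal = volume.sum(axis=0)  # shape (H, W)
    lateral = volume.sum(axis=2)  # shape (D, H)
    return frontal, lateral

def projection_error(volume, frontal, lateral):
    """Mean absolute difference between a volume's projections and the given views."""
    f, l = biplanar_projections(volume)
    return float(np.abs(f - frontal).mean() + np.abs(l - lateral).mean())

# Two different volumes: one bright voxel on each "diagonal" of a 2 x 1 x 2 grid.
vol_a = np.zeros((2, 1, 2))
vol_a[0, 0, 0] = 1.0
vol_a[1, 0, 1] = 1.0

vol_b = np.zeros((2, 1, 2))
vol_b[0, 0, 1] = 1.0
vol_b[1, 0, 0] = 1.0

frontal_a, lateral_a = biplanar_projections(vol_a)
frontal_b, lateral_b = biplanar_projections(vol_b)

same_views = np.array_equal(frontal_a, frontal_b) and np.array_equal(lateral_a, lateral_b)
same_volume = np.array_equal(vol_a, vol_b)
error_b_vs_a = projection_error(vol_b, frontal_a, lateral_a)
```

Here `same_views` is true while `same_volume` is false, and `error_b_vs_a` is zero: both volumes explain the two views perfectly. This is the ill-posedness the abstract refers to, and it is why a learned prior such as a diffusion model is needed to choose an anatomically plausible reconstruction among the many that match the X-rays.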

Critical Analysis

The DiffuX2CT paper presents a promising approach for reconstructing 3D CT images from biplanar X-ray data. The use of a diffusion-based model with an implicit conditioning mechanism is a novel and potentially powerful technique for leveraging the information contained in the X-ray images.

One open question is how far the results generalize. The evaluation covers the authors' own LumbarV dataset and three publicly available datasets, and while the results are encouraging, further validation on larger and more diverse datasets would help to assess the generalizability of the approach. Additionally, the authors do not provide a detailed analysis of the limitations or failure modes of the DiffuX2CT model, which would be valuable for understanding its practical applicability.

Compared to other related methods, such as X-ray to CTPA generation and 3D rotation of radiographs, the DiffuX2CT approach appears to offer some unique advantages in terms of its ability to reconstruct high-quality 3D CT images from just two X-ray views. However, a more thorough comparison to these and other related techniques would help to better situate the contributions of this work.

Conclusion

The DiffuX2CT paper presents a novel diffusion-based method for reconstructing 3D CT images from biplanar X-ray data. The key innovations include an implicit conditioning mechanism to leverage the X-ray information and a tri-plane representation to efficiently capture the 3D anatomy from just two 2D views.

This research could have important implications for medical imaging, potentially enabling faster and more accessible CT scans by reducing the number of required X-ray views. Additionally, the diffusion-based approach may offer advantages in terms of generating anatomically consistent reconstructions compared to traditional methods.

While further validation and comparison to related techniques would be valuable, the DiffuX2CT model represents an exciting step forward in the field of 3D medical image reconstruction from limited 2D data.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays

Xuhui Liu, Zhi Qiao, Runkun Liu, Hong Li, Juan Zhang, Xiantong Zhen, Zhen Qian, Baochang Zhang

Computed tomography (CT) is widely utilized in clinical settings because it delivers detailed 3D images of the human body. However, performing CT scans is not always feasible due to radiation exposure and limitations in certain surgical environments. As an alternative, reconstructing CT images from ultra-sparse X-rays offers a valuable solution and has gained significant interest in scientific research and medical applications. However, it presents great challenges as it is inherently an ill-posed problem, often compromised by artifacts resulting from overlapping structures in X-ray images. In this paper, we propose DiffuX2CT, which models CT reconstruction from orthogonal biplanar X-rays as a conditional diffusion process. DiffuX2CT is established with a 3D global coherence denoising model with a new, implicit conditioning mechanism. We realize the conditioning mechanism by a newly designed tri-plane decoupling generator and an implicit neural decoder. By doing so, DiffuX2CT achieves structure-controllable reconstruction, which enables 3D structural information to be recovered from 2D X-rays, therefore producing faithful textures in CT images. As an extra contribution, we collect a real-world lumbar CT dataset, called LumbarV, as a new benchmark to verify the clinical significance and performance of CT reconstruction from X-rays. Extensive experiments on this dataset and three more publicly available datasets demonstrate the effectiveness of our proposal.

7/19/2024

Diff2CT: Diffusion Learning to Reconstruct Spine CT from Biplanar X-Rays

Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

Intraoperative CT imaging serves as a crucial resource for surgical guidance; however, it may not always be readily accessible or practical to implement. In scenarios where CT imaging is not an option, reconstructing CT scans from X-rays can offer a viable alternative. In this paper, we introduce an innovative method for 3D CT reconstruction utilizing biplanar X-rays. Distinct from previous research that relies on conventional image generation techniques, our approach leverages a conditional diffusion process to tackle the task of reconstruction. More precisely, we employ a diffusion-based probabilistic model trained to produce 3D CT images based on orthogonal biplanar X-rays. To improve the structural integrity of the reconstructed images, we incorporate a novel projection loss function. Experimental results validate that our proposed method surpasses existing state-of-the-art benchmarks in both visual image quality and multiple evaluative metrics. Specifically, our technique achieves a higher Structural Similarity Index (SSIM) of 0.83, a relative increase of 10%, and a lower Fréchet Inception Distance (FID) of 83.43, which represents a relative decrease of 25%.

8/22/2024

DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s)

Yun Su Jeong, Hye Bin Yoo, Il Yong Chun

Computational tomography (CT) provides high-resolution medical imaging, but it can expose patients to high radiation. X-ray scanners have low radiation exposure, but their resolutions are low. This paper proposes a new conditional diffusion model, DX2CT, that reconstructs three-dimensional (3D) CT volumes from bi or mono-planar X-ray image(s). Proposed DX2CT consists of two key components: 1) modulating feature maps extracted from two-dimensional (2D) X-ray(s) with 3D positions of CT volume using a new transformer and 2) effectively using the modulated 3D position-aware feature maps as conditions of DX2CT. In particular, the proposed transformer can provide conditions with rich information of a target CT slice to the conditional diffusion model, enabling high-quality CT reconstruction. Our experiments with the bi or mono-planar X-ray(s) benchmark datasets show that proposed DX2CT outperforms several state-of-the-art methods. Our codes and model will be available at: https://www.github.com/intyeger/DX2CT.

9/16/2024

DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays

Yiran Sun, Hana Baroudi, Tucker Netherton, Laurence Court, Osama Mawlawi, Ashok Veeraraghavan, Guha Balakrishnan

Computed Tomography (CT) scans are the standard-of-care for the visualization and diagnosis of many clinical ailments, and are needed for the treatment planning of external beam radiotherapy. Unfortunately, the availability of CT scanners in low- and mid-resource settings is highly variable. Planar x-ray radiography units, in comparison, are far more prevalent, but can only provide limited 2D observations of the 3D anatomy. In this work we propose DIFR3CT, a 3D latent diffusion model, that can generate a distribution of plausible CT volumes from one or few (<10) planar x-ray observations. DIFR3CT works by fusing 2D features from each x-ray into a joint 3D space, and performing diffusion conditioned on these fused features in a low-dimensional latent space. We conduct extensive experiments demonstrating that DIFR3CT is better than recent sparse CT reconstruction baselines in terms of standard pixel-level (PSNR, SSIM) on both the public LIDC and in-house post-mastectomy CT datasets. We also show that DIFR3CT supports uncertainty quantification via Monte Carlo sampling, which provides an opportunity to measure reconstruction reliability. Finally, we perform a preliminary pilot study evaluating DIFR3CT for automated breast radiotherapy contouring and planning -- and demonstrate promising feasibility. Our code is available at https://github.com/yransun/DIFR3CT.

8/28/2024