Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

Read original: arXiv:2407.01090 - Published 7/9/2024 by Yiqun Lin, Hualiang Wang, Jixiang Chen, Xiaomeng Li

Overview

This paper presents a novel approach for reconstructing cone-beam CT (CBCT) images from extremely sparse-view data, which is important for reducing patient radiation exposure in medical imaging.
The method involves learning 3D Gaussian distributions to represent the CT volume, and using a neural network to predict the Gaussian parameters from sparse-view projection data.
The authors report that their approach, named DIF-Gaussian in the paper, outperforms previous state-of-the-art sparse-view reconstruction methods on two public datasets.

Plain English Explanation

Cone-beam CT (CBCT) is a type of medical imaging that uses x-rays to create 3D images of the inside of the body. However, to reduce the amount of radiation that patients are exposed to, doctors often use fewer x-ray views, which can lead to poor image quality.

The researchers in this paper have developed a new way to reconstruct high-quality 3D CBCT images from these sparse, low-dose x-ray views. Their key insight is to represent the CT volume as a collection of 3D Gaussian distributions, rather than as a grid of voxel values. (A Gaussian distribution is a bell-shaped curve that is commonly used in statistics and machine learning.)

The researchers train a neural network to predict the parameters (size, location, and orientation) of these 3D Gaussians directly from the sparse x-ray projection data. By "splatting" or smearing out these Gaussian shapes onto a 3D grid, they are able to reconstruct a high-quality CT image using far fewer x-ray views than traditional methods.
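
To make the three parameters concrete, here is a minimal Python sketch (NumPy assumed) of how a Gaussian's size and orientation are commonly combined into one covariance matrix, Sigma = R S S^T R^T, where S is a diagonal matrix of per-axis sizes and R is a rotation matrix. The function name and the example values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def covariance_from_scale_rotation(scales, rotation):
    """Build a 3x3 covariance matrix Sigma = R S S^T R^T.

    scales:   per-axis standard deviations (the "size"), shape (3,)
    rotation: 3x3 rotation matrix (the "orientation")
    """
    S = np.diag(scales)
    return rotation @ S @ S.T @ rotation.T

# Illustrative values: an elongated Gaussian rotated 90 degrees about the z-axis,
# so its long axis moves from x to y.
theta = np.pi / 2
rotation_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
cov = covariance_from_scale_rotation(np.array([0.2, 0.05, 0.05]), rotation_z)
```

Building the covariance this way keeps it symmetric and positive semi-definite by construction, which is one reason this factorization is popular in Gaussian-based representations.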

This approach has several advantages. It is more computationally efficient than other sparse-view reconstruction methods, and it can produce images with better visual quality and anatomical detail. The authors also show that their method is robust to variations in the sparsity of the input x-ray data.

Technical Explanation

The core of the approach is to represent the 3D CT volume as a collection of 3D Gaussian distributions rather than as a grid of voxel values. The neural network is trained to predict the parameters of these Gaussians (mean, covariance, and weight) directly from the sparse-view projection data. The mean gives each Gaussian's location, the covariance encodes its size and orientation, and the weight sets its contribution to the volume.
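
As a generic illustration, a volume built from N weighted 3D Gaussians can be written as below. The symbols (w_i for the weight, m_i for the mean, and Sigma_i for the covariance of the i-th Gaussian) are notation assumed here. The paper's abstract describes the Gaussians as representing a feature distribution that helps estimate attenuation coefficients, so its exact formulation may differ.

```latex
\hat{\mu}(\mathbf{x}) \;=\; \sum_{i=1}^{N} w_i \,
\exp\!\left( -\tfrac{1}{2}\,
(\mathbf{x}-\mathbf{m}_i)^{\top} \boldsymbol{\Sigma}_i^{-1} (\mathbf{x}-\mathbf{m}_i) \right)
```

Each term peaks at m_i with height w_i and falls off at a rate set by Sigma_i, so a small number of Gaussians can cover a region that would otherwise need many voxels.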

During reconstruction, the predicted Gaussians are "splatted" (smeared out) onto a 3D grid to produce a dense CT volume. This splatting step is described as more computationally efficient than traditional sparse-view reconstruction methods, which typically rely on iterative optimization.
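
The splatting step can be sketched in a few lines of Python (NumPy assumed). This is a simplified illustration rather than the authors' implementation: the function name, the unit-cube grid, and the example Gaussians are all assumptions, and each Gaussian is evaluated at every voxel instead of only in its neighbourhood.

```python
import numpy as np

def splat_gaussians(means, covariances, weights, grid_size=16):
    """Sum weighted 3D Gaussians on a regular voxel grid covering [0, 1]^3.

    Returns an array of shape (grid_size, grid_size, grid_size), indexed (z, y, x).
    """
    axis = (np.arange(grid_size) + 0.5) / grid_size  # voxel-centre coordinates
    zz, yy, xx = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xx, yy, zz], axis=-1).reshape(-1, 3)  # (V, 3) as (x, y, z)

    volume = np.zeros(points.shape[0])
    for mean, cov, weight in zip(means, covariances, weights):
        diff = points - mean                     # offset of every voxel from the mean
        inv_cov = np.linalg.inv(cov)
        # Squared Mahalanobis distance of every voxel centre to this Gaussian.
        mahalanobis = np.einsum("vi,ij,vj->v", diff, inv_cov, diff)
        volume += weight * np.exp(-0.5 * mahalanobis)
    return volume.reshape(grid_size, grid_size, grid_size)

# Illustrative input: one isotropic Gaussian at the centre and one elongated along x.
means = np.array([[0.5, 0.5, 0.5], [0.25, 0.25, 0.75]])
covariances = np.array([np.eye(3) * 0.01, np.diag([0.02, 0.005, 0.005])])
weights = np.array([1.0, 0.5])
volume = splat_gaussians(means, covariances, weights)
print(volume.shape)  # (16, 16, 16)
```

A practical implementation would restrict each Gaussian to nearby voxels and run on a GPU; the dense loop here is only meant to show what "smearing" a Gaussian onto a grid computes.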

Beyond the Gaussian representation, the paper's abstract highlights one further component: test-time optimization during inference, which is intended to improve the model's generalization capability.

In experiments on two public datasets, the authors report that their approach significantly outperforms previous state-of-the-art sparse-view reconstruction methods.

Critical Analysis

The authors acknowledge several limitations of their approach, including the need for a relatively large dataset of paired projection and ground-truth CT data for training, and the potential for artifacts or inaccuracies in the reconstructed images, especially in regions with complex anatomy.

Additionally, while the method shows promising results on the specific datasets and imaging setups tested, further research is needed to evaluate its generalization to a wider range of clinical scenarios, such as different anatomical regions, imaging protocols, and levels of sparsity.

It would also be interesting to see how this approach compares with other recent deep-learning techniques for sparse-view CT reconstruction, including those based on implicit neural representations.

Conclusion

This paper presents a novel approach called DIF-Gaussian for reconstructing high-quality cone-beam CT images from extremely sparse-view projection data. By representing the CT volume as a collection of 3D Gaussian distributions and using a neural network to predict the Gaussian parameters, the authors demonstrate significant improvements in image quality and computational efficiency compared to traditional sparse-view reconstruction methods.

The approach has the potential to enable low-dose CBCT imaging, which could reduce radiation exposure for patients undergoing various medical procedures. However, further research is needed to fully evaluate the method's robustness and generalization to different clinical scenarios.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

Yiqun Lin, Hualiang Wang, Jixiang Chen, Xiaomeng Li

Cone-Beam Computed Tomography (CBCT) is an indispensable technique in medical imaging, yet the associated radiation exposure raises concerns in clinical practice. To mitigate these risks, sparse-view reconstruction has emerged as an essential research direction, aiming to reduce the radiation dose by utilizing fewer projections for CT reconstruction. Although implicit neural representations have been introduced for sparse-view CBCT reconstruction, existing methods primarily focus on local 2D features queried from sparse projections, which is insufficient to process the more complicated anatomical structures, such as the chest. To this end, we propose a novel reconstruction framework, namely DIF-Gaussian, which leverages 3D Gaussians to represent the feature distribution in the 3D space, offering additional 3D spatial information to facilitate the estimation of attenuation coefficients. Furthermore, we incorporate test-time optimization during inference to further improve the generalization capability of the model. We evaluate DIF-Gaussian on two public datasets, showing significantly superior reconstruction performance than previous state-of-the-art methods.

7/9/2024

C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction

Yiqun Lin, Jiewen Yang, Hualiang Wang, Xinpeng Ding, Wei Zhao, Xiaomeng Li

Cone beam computed tomography (CBCT) is an important imaging technology widely used in medical scenarios, such as diagnosis and preoperative planning. Using fewer projection views to reconstruct CT, also known as sparse-view reconstruction, can reduce ionizing radiation and further benefit interventional radiology. Compared with sparse-view reconstruction for traditional parallel/fan-beam CT, CBCT reconstruction is more challenging due to the increased dimensionality caused by the measurement process based on cone-shaped X-ray beams. As a 2D-to-3D reconstruction problem, although implicit neural representations have been introduced to enable efficient training, only local features are considered and different views are processed equally in previous works, resulting in spatial inconsistency and poor performance on complicated anatomies. To this end, we propose C^2RV by leveraging explicit multi-scale volumetric representations to enable cross-regional learning in the 3D space. Additionally, the scale-view cross-attention module is introduced to adaptively aggregate multi-scale and multi-view features. Extensive experiments demonstrate that our C^2RV achieves consistent and significant improvement over previous state-of-the-art methods on datasets with diverse anatomy.

6/7/2024

GaSpCT: Gaussian Splatting for Novel CT Projection View Synthesis

Emmanouil Nikolakakis, Utkarsh Gupta, Jonathan Vengosh, Justin Bui, Razvan Marinescu

We present GaSpCT, a novel view synthesis and 3D scene representation method used to generate novel projection views for Computer Tomography (CT) scans. We adapt the Gaussian Splatting framework to enable novel view synthesis in CT based on limited sets of 2D image projections and without the need for Structure from Motion (SfM) methodologies. Therefore, we reduce the total scanning duration and the amount of radiation dose the patient receives during the scan. We adapted the loss function to our use-case by encouraging a stronger background and foreground distinction using two sparsity promoting regularizers: a beta loss and a total variation (TV) loss. Finally, we initialize the Gaussian locations across the 3D space using a uniform prior distribution of where the brain's positioning would be expected to be within the field of view. We evaluate the performance of our model using brain CT scans from the Parkinson's Progression Markers Initiative (PPMI) dataset and demonstrate that the rendered novel views closely match the original projection views of the simulated scan, and have better performance than other implicit 3D scene representations methodologies. Furthermore, we empirically observe reduced training time compared to neural network based image synthesis for sparse-view CT image reconstruction. Finally, the memory requirements of the Gaussian Splatting representations are reduced by 17% compared to the equivalent voxel grid image representations.

4/5/2024

LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors

Hanyang Yu, Xiaoxiao Long, Ping Tan

We aim to address sparse-view reconstruction of a 3D scene by leveraging priors from large-scale vision models. While recent advancements such as 3D Gaussian Splatting (3DGS) have demonstrated remarkable successes in 3D reconstruction, these methods typically necessitate hundreds of input images that densely capture the underlying scene, making them time-consuming and impractical for real-world applications. However, sparse-view reconstruction is inherently ill-posed and under-constrained, often resulting in inferior and incomplete outcomes. This is due to issues such as failed initialization, overfitting on input images, and a lack of details. To mitigate these challenges, we introduce LM-Gaussian, a method capable of generating high-quality reconstructions from a limited number of images. Specifically, we propose a robust initialization module that leverages stereo priors to aid in the recovery of camera poses and the reliable point clouds. Additionally, a diffusion-based refinement is iteratively applied to incorporate image diffusion priors into the Gaussian optimization process to preserve intricate scene details. Finally, we utilize video diffusion priors to further enhance the rendered images for realistic visual effects. Overall, our approach significantly reduces the data acquisition requirements compared to previous 3DGS methods. We validate the effectiveness of our framework through experiments on various public datasets, demonstrating its potential for high-quality 360-degree scene reconstruction. Visual results are on our website.

9/19/2024