C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction

Read original: arXiv:2406.03902 - Published 6/7/2024 by Yiqun Lin, Jiewen Yang, Hualiang Wang, Xinpeng Ding, Wei Zhao, Xiaomeng Li

Overview

• This paper presents a deep learning-based approach called C^2RV (Cross-Regional and Cross-View Learning) for sparse-view cone-beam computed tomography (CBCT) reconstruction.

• The key idea is to leverage cross-regional and cross-view information to improve the quality of CBCT reconstructions from sparse-view data, which is important for reducing patient radiation exposure.

• The proposed method reports consistent and significant improvement over previous state-of-the-art methods on datasets with diverse anatomy, supporting the cross-regional and cross-view learning strategy.

Plain English Explanation

CBCT is a type of medical imaging that uses x-rays to create 3D models of the inside of the body. However, taking too many x-ray images can be harmful to patients by exposing them to higher levels of radiation. The goal of this research is to develop a way to reconstruct high-quality 3D images from just a few x-ray images, reducing the radiation dose patients receive.
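To make "just a few x-ray images" concrete, the short sketch below picks a handful of evenly spaced projection angles out of a full rotation. The function name and the view counts are illustrative assumptions, not settings taken from the paper, and even spacing is only one possible way to choose the views.

```python
def sparse_view_angles(num_views, full_rotation_deg=360.0):
    """Return `num_views` evenly spaced projection angles in degrees.

    Hypothetical helper: even spacing over a full rotation is an assumption.
    """
    if num_views <= 0:
        raise ValueError("num_views must be positive")
    step = full_rotation_deg / num_views
    return [i * step for i in range(num_views)]


dense = sparse_view_angles(360)   # one view per degree (illustrative)
sparse = sparse_view_angles(8)    # a sparse-view acquisition (illustrative)

print(sparse)                     # [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0]
print(len(sparse) / len(dense))   # fraction of the dense view count that is kept
```

If every projection delivered roughly the same dose, the dose would scale with the number of views, which is why fewer views are attractive. That proportionality is a simplifying assumption here, not a figure from the paper.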

The key innovation is a deep learning model called C^2RV that can leverage information from different regions of the body and different viewing angles to improve the reconstruction quality, even when only a small number of x-ray images are available. By combining data from multiple perspectives, the model can fill in the missing information and generate accurate 3D images using just a sparse set of x-ray scans.

The researchers show that C^2RV outperforms other state-of-the-art methods for sparse-view CBCT reconstruction, making it a promising approach for clinical applications that require lower radiation exposure, such as accurate patient alignment without unnecessary imaging dose.

Technical Explanation

The C^2RV model is designed to learn cross-regional and cross-view information to enhance the reconstruction of CBCT images from sparse-view data. The architecture consists of two main components:

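• A cross-regional learning component, which according to the abstract uses explicit multi-scale volumetric representations so that features are learned across regions of the 3D space rather than only locally.

• A cross-view learning component, the scale-view cross-attention module, which adaptively aggregates multi-scale and multi-view features instead of treating every view equally.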

These two modules are integrated into a unified network that can effectively leverage both regional and view-specific knowledge to produce high-quality reconstructions from sparse-view inputs. The model is trained end-to-end using a combination of reconstruction and adversarial losses.
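The idea of weighting views differently can be sketched with plain scaled dot-product attention. This is a generic illustration, not the paper's scale-view cross-attention module: it has no learned projections and no multi-scale branch, and the names `aggregate_views`, `query`, and `view_features` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_views(query, view_features):
    """Fuse per-view feature vectors for a single 3D point.

    query:         feature vector describing the 3D point (assumed input)
    view_features: one feature vector per projection view
    Returns (fused_vector, attention_weights).
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # Scaled dot-product score between the query and each view's feature.
    scores = [sum(q * f for q, f in zip(query, feat)) / scale
              for feat in view_features]
    weights = softmax(scores)
    # Weighted sum: views with higher scores contribute more to the result.
    fused = [sum(w * feat[d] for w, feat in zip(weights, view_features))
             for d in range(dim)]
    return fused, weights


query = [1.0, 0.0]
views = [[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]]   # toy features from three views
fused, weights = aggregate_views(query, views)
print(weights)   # the first and third views receive equal, larger weights
print(fused)
```

Averaging the views instead would give every view a weight of 1/3 regardless of how informative it is for the queried point. The attention weights let that contribution vary per point.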

The authors evaluate C^2RV on datasets with diverse anatomy and compare it against previous state-of-the-art methods. Related approaches to sparse-view reconstruction include multi-scale diffusion for ultra-sparse reconstruction, implicit neural representations for robust joint sparse-view reconstruction, and dual-domain unfolding for CT reconstruction. The reported results show consistent and significant improvement over the previous state-of-the-art methods, supporting the cross-regional and cross-view learning approach.

Critical Analysis

The paper presents a compelling approach to sparse-view CBCT reconstruction, but there are a few potential limitations and areas for further research:

• The performance of C^2RV may be sensitive to the specific anatomy and imaging characteristics of the CBCT scans used during training. Evaluating the model's generalization to a wider range of anatomical regions and imaging modalities would be valuable.

• The paper does not provide a detailed analysis of the model's computational complexity and inference time, which are important factors for clinical deployment. Further investigation into the efficiency of the C^2RV architecture would be helpful.

• While the cross-regional and cross-view learning strategy is a key innovation, the paper does not explore the relative contributions of these two components to the overall performance. Ablation studies could shed light on the importance of each module.

• Integrating regularization techniques, such as space-variant total variation boosted by learning, could potentially improve the robustness and consistency of the reconstructions.

Conclusion

The C^2RV model presented in this paper is a promising deep learning-based approach for sparse-view CBCT reconstruction. By leveraging cross-regional and cross-view information, the model can generate high-quality 3D reconstructions from a limited number of x-ray images, which is crucial for reducing patient radiation exposure in medical imaging applications. The strong performance of C^2RV on datasets with diverse anatomy highlights the effectiveness of this cross-regional and cross-view learning strategy, and further research into its generalization and efficiency could lead to valuable clinical innovations.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction

Yiqun Lin, Jiewen Yang, Hualiang Wang, Xinpeng Ding, Wei Zhao, Xiaomeng Li

Cone beam computed tomography (CBCT) is an important imaging technology widely used in medical scenarios, such as diagnosis and preoperative planning. Using fewer projection views to reconstruct CT, also known as sparse-view reconstruction, can reduce ionizing radiation and further benefit interventional radiology. Compared with sparse-view reconstruction for traditional parallel/fan-beam CT, CBCT reconstruction is more challenging due to the increased dimensionality caused by the measurement process based on cone-shaped X-ray beams. As a 2D-to-3D reconstruction problem, although implicit neural representations have been introduced to enable efficient training, only local features are considered and different views are processed equally in previous works, resulting in spatial inconsistency and poor performance on complicated anatomies. To this end, we propose C^2RV by leveraging explicit multi-scale volumetric representations to enable cross-regional learning in the 3D space. Additionally, the scale-view cross-attention module is introduced to adaptively aggregate multi-scale and multi-view features. Extensive experiments demonstrate that our C^2RV achieves consistent and significant improvement over previous state-of-the-art methods on datasets with diverse anatomy.

6/7/2024
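The abstract above frames CBCT reconstruction as a 2D-to-3D problem that draws on multi-view features. One common way to obtain per-view features for a 3D point is to project the point onto each detector plane and look up the feature at that location. The sketch below shows only that projection step for a simplified circular-orbit cone-beam geometry. The function name, the distances, and the axis conventions are assumptions for illustration, not the geometry used in the paper.

```python
import math


def project_point(point, angle_deg,
                  source_to_origin=500.0, source_to_detector=1000.0):
    """Project a 3D point onto a flat detector for one view.

    Assumed geometry: the source rotates about the z axis at distance
    `source_to_origin` from the origin, and the detector is perpendicular
    to the central ray at distance `source_to_detector` from the source.
    Returns (u, v) detector coordinates in the same length unit as the input.
    """
    x, y, z = point
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Distance from the source to the point, measured along the central ray.
    depth = source_to_origin - (x * cos_t + y * sin_t)
    if depth <= 0:
        raise ValueError("point lies at or behind the X-ray source")
    # In-plane offset of the point from the central ray.
    lateral = -x * sin_t + y * cos_t
    # Cone-beam magnification: points closer to the source appear larger.
    magnification = source_to_detector / depth
    return lateral * magnification, z * magnification


point = (10.0, 0.0, 10.0)
for angle in (0.0, 90.0, 180.0):
    u, v = project_point(point, angle)
    print(angle, round(u, 3), round(v, 3))
```

Because the magnification depends on the point's depth, the same 3D point lands at a different detector location in every view. Features gathered from different views therefore have to be combined rather than read from a single image.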

Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

Yiqun Lin, Hualiang Wang, Jixiang Chen, Xiaomeng Li

Cone-Beam Computed Tomography (CBCT) is an indispensable technique in medical imaging, yet the associated radiation exposure raises concerns in clinical practice. To mitigate these risks, sparse-view reconstruction has emerged as an essential research direction, aiming to reduce the radiation dose by utilizing fewer projections for CT reconstruction. Although implicit neural representations have been introduced for sparse-view CBCT reconstruction, existing methods primarily focus on local 2D features queried from sparse projections, which is insufficient to process the more complicated anatomical structures, such as the chest. To this end, we propose a novel reconstruction framework, namely DIF-Gaussian, which leverages 3D Gaussians to represent the feature distribution in the 3D space, offering additional 3D spatial information to facilitate the estimation of attenuation coefficients. Furthermore, we incorporate test-time optimization during inference to further improve the generalization capability of the model. We evaluate DIF-Gaussian on two public datasets, showing significantly superior reconstruction performance than previous state-of-the-art methods.

7/9/2024

MVMS-RCN: A Dual-Domain Unfolding CT Reconstruction with Multi-sparse-view and Multi-scale Refinement-correction

Xiaohong Fan, Ke Chen, Huaming Yi, Yin Yang, Jianping Zhang

X-ray Computed Tomography (CT) is one of the most important diagnostic imaging techniques in clinical applications. Sparse-view CT imaging reduces the number of projection views to a lower radiation dose and alleviates the potential risk of radiation exposure. Most existing deep learning (DL) and deep unfolding sparse-view CT reconstruction methods: 1) do not fully use the projection data; 2) do not always link their architecture designs to a mathematical theory; 3) do not flexibly deal with multi-sparse-view reconstruction assignments. This paper aims to use mathematical ideas and design optimal DL imaging algorithms for sparse-view tomography reconstructions. We propose a novel dual-domain deep unfolding unified framework that offers a great deal of flexibility for multi-sparse-view CT reconstruction with different sampling views through a single model. This framework combines the theoretical advantages of model-based methods with the superior reconstruction performance of DL-based methods, resulting in the expected generalizability of DL. We propose a refinement module that utilizes unfolding projection domain to refine full-sparse-view projection errors, as well as an image domain correction module that distills multi-scale geometric error corrections to reconstruct sparse-view CT. This provides us with a new way to explore the potential of projection information and a new perspective on designing network architectures. All parameters of our proposed framework are learnable end to end, and our method possesses the potential to be applied to plug-and-play reconstruction. Extensive experiments demonstrate that our framework is superior to other existing state-of-the-art methods. Our source codes are available at https://github.com/fanxiaohong/MVMS-RCN.

5/28/2024
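Deep unfolding, as mentioned in the abstract above, turns a fixed number of iterations of an iterative solver into the stages of a network. The sketch below is a generic illustration of that idea and not the MVMS-RCN architecture: each stage takes a gradient step on the data-fidelity term and then applies a soft threshold that stands in for a learned correction. The tiny matrix `A`, the step size, and the threshold are made-up values. In a trained network they would be learned rather than fixed.

```python
import math


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def soft_threshold(x, t):
    """Shrink each entry toward zero by t (stand-in for a learned module)."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in x]


def unrolled_reconstruction(A, y, num_stages=10, step=0.2, threshold=0.0):
    """Run `num_stages` unrolled stages starting from a zero image."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(num_stages):
        # Data-fidelity gradient: A^T (A x - y).
        residual = [p - m for p, m in zip(matvec(A, x), y)]
        grad = matvec(At, residual)
        x = [xi - step * g for xi, g in zip(x, grad)]
        # Image-domain correction (fixed here, learned in a real network).
        x = soft_threshold(x, threshold)
    return x


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy "projection" operator
y = [1.0, 2.0, 3.0]                        # measurements of x_true = [1, 2]

print(unrolled_reconstruction(A, y, num_stages=5))
print(unrolled_reconstruction(A, y, num_stages=200))
```

With the threshold set to zero the stages reduce to plain gradient descent, so running more stages moves the estimate closer to the least-squares solution. The appeal of unfolding is that a small, fixed number of stages with learned parameters can replace many hand-tuned iterations.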

Implicit Neural Representations for Robust Joint Sparse-View CT Reconstruction

Jiayang Shi, Junyi Zhu, Daniel M. Pelt, K. Joost Batenburg, Matthew B. Blaschko

Computed Tomography (CT) is pivotal in industrial quality control and medical diagnostics. Sparse-view CT, offering reduced ionizing radiation, faces challenges due to its under-sampled nature, leading to ill-posed reconstruction problems. Recent advancements in Implicit Neural Representations (INRs) have shown promise in addressing sparse-view CT reconstruction. Recognizing that CT often involves scanning similar subjects, we propose a novel approach to improve reconstruction quality through joint reconstruction of multiple objects using INRs. This approach can potentially leverage both the strengths of INRs and the statistical regularities across multiple objects. While current INR joint reconstruction techniques primarily focus on accelerating convergence via meta-initialization, they are not specifically tailored to enhance reconstruction quality. To address this gap, we introduce a novel INR-based Bayesian framework integrating latent variables to capture the inter-object relationships. These variables serve as a dynamic reference throughout the optimization, thereby enhancing individual reconstruction fidelity. Our extensive experiments, which assess various key factors such as reconstruction quality, resistance to overfitting, and generalizability, demonstrate significant improvements over baselines in common numerical metrics. This underscores a notable advancement in CT reconstruction methods.

5/7/2024