DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction

arXiv:2406.10211

Published 6/17/2024 by Bowen Song, Jason Hu, Zhaoxu Luo, Jeffrey A. Fessler, Liyue Shen

Abstract

Diffusion models face significant challenges when employed for large-scale medical image reconstruction in real practice such as 3D Computed Tomography (CT). Due to the demanding memory, time, and data requirements, it is difficult to train a diffusion model directly on the entire volume of high-dimensional data to obtain an efficient 3D diffusion prior. Existing works that apply diffusion priors to single 2D image slices with hand-crafted cross-slice regularization sacrifice z-axis consistency, which results in severe artifacts along the z-axis. In this work, we propose a novel framework that enables learning the 3D image prior through position-aware 3D-patch diffusion score blending for reconstructing large-scale 3D medical images. To the best of our knowledge, we are the first to utilize a 3D-patch diffusion prior for 3D medical image reconstruction. Extensive experiments on sparse-view and limited-angle CT reconstruction show that our DiffusionBlend method significantly outperforms previous methods and achieves state-of-the-art performance on real-world CT reconstruction problems with high-dimensional 3D images (i.e., $256 \times 256 \times 500$). Our algorithm also offers better or comparable computational efficiency relative to previous state-of-the-art methods.

Overview

This paper proposes a novel method called "DiffusionBlend" for 3D computed tomography (CT) reconstruction that leverages position-aware diffusion score blending to learn an effective 3D image prior.
The method blends position-aware diffusion scores computed on 3D patches of the volume, rather than on isolated 2D slices, to improve z-axis consistency in the reconstruction.
Experiments on sparse-view and limited-angle CT reconstruction show that DiffusionBlend outperforms previous state-of-the-art approaches.

Plain English Explanation

DiffusionBlend is a new technique for improving the quality of 3D medical images created using CT scans. CT scans produce 2D slices of the body, which are then combined to create a 3D image. However, the process of combining these 2D slices is challenging and can result in low-quality 3D images.

The key idea behind DiffusionBlend is to use "diffusion scores", which a trained diffusion model produces to indicate how a noisy image should change to look more realistic, to build a high-quality 3D image. Specifically, the method computes these scores on small 3D patches of the volume, takes each patch's position into account, and blends the results so that the reconstruction stays consistent from slice to slice.

By using this position-aware diffusion score blending approach, DiffusionBlend is able to produce 3D CT reconstructions that are significantly better than what can be achieved using other state-of-the-art methods. This is an important advance, as high-quality 3D medical images are essential for accurate diagnosis and treatment planning.

Technical Explanation

The key technical components of DiffusionBlend are:

Diffusion Score Estimation: The method estimates a diffusion score, the gradient of the log-density of noisy images, which indicates how the current estimate should change to look more like a plausible CT image. Per the abstract, this prior is learned on 3D patches of the volume rather than on single 2D slices, because training on the entire volume is too demanding in memory, time, and data.
Position-aware Blending: Rather than treating every patch identically, DiffusionBlend takes the position of each patch within the 3D volume into account and blends the patch scores accordingly. This lets it capture structure along the z-axis that slice-by-slice priors miss.
3D CT Reconstruction: The blended scores then guide the reconstruction of the final 3D CT volume from the measurements, which the paper reports improves image quality over previous approaches.
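
The blending step can be illustrated with a toy sketch. This is not the authors' implementation: the summary does not specify the blending rule, so the code below assumes that the volume is split into slabs of adjacent slices along z, that each slab is scored by a position-conditioned model, and that scores from several shifted partitions are averaged. The names toy_patch_score and blended_score, the slab partitioning, and the Gaussian stand-in for a trained network are all hypothetical.

```python
import numpy as np


def toy_patch_score(patch, z_start, sigma):
    """Hypothetical stand-in for a trained, position-conditioned score network.

    Returns the score of a zero-mean Gaussian with variance 1 + sigma**2.
    A real model would use z_start (the slab's position); this toy ignores it.
    """
    return -patch / (1.0 + sigma ** 2)


def blended_score(volume, patch_depth, offsets, sigma, score_fn=toy_patch_score):
    """Average slab scores over several shifted z-partitions of a (Z, H, W) volume."""
    depth = volume.shape[0]
    total = np.zeros(volume.shape, dtype=float)
    for offset in offsets:
        # Slab boundaries for this partition; edge slabs may be thinner.
        cuts = sorted({0, depth, *range(offset, depth, patch_depth)})
        for lo, hi in zip(cuts[:-1], cuts[1:]):
            # Each voxel is scored exactly once per partition.
            total[lo:hi] += score_fn(volume[lo:hi], lo, sigma)
    return total / len(offsets)


rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 4, 4))  # tiny stand-in for a CT volume
score = blended_score(volume, patch_depth=3, offsets=[0, 1, 2], sigma=0.5)
print(score.shape)
```

Shifting the offset moves the slab boundaries, so under these assumptions no pair of neighboring slices is separated by a boundary in every partition. That is one plausible way to encourage z-axis consistency without ever scoring the full volume at once.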

The paper evaluates DiffusionBlend on several 3D CT datasets and shows that it outperforms state-of-the-art methods such as Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems, Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models, and CT Reconstruction using Diffusion Posterior Sampling conditioned on a Nonlinear Measurement Model in terms of various image quality metrics.
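
The summary does not name the metrics used. As one assumed example, peak signal-to-noise ratio (PSNR) is a common choice for comparing a reconstruction against a reference volume. The function name psnr and the data_range default below are illustrative choices, not taken from the paper.

```python
import numpy as np


def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between two arrays of the same shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if data_range is None:
        # Assumption: use the dynamic range of the reference as the peak value.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical arrays
    return 10.0 * np.log10(data_range ** 2 / mse)


reference = np.array([0.0, 1.0])
estimate = np.array([0.1, 0.9])
print(psnr(reference, estimate))  # approximately 20 dB
```

Higher values indicate a reconstruction closer to the reference. For 3D volumes the same function applies unchanged, since the mean is taken over all voxels.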

Critical Analysis

The paper provides a thorough evaluation of DiffusionBlend and demonstrates its effectiveness for 3D CT reconstruction. However, some potential limitations and areas for further research are worth noting:

The method assumes that the 2D slices are already well-aligned, which may not always be the case in practice. Extending DiffusionBlend to handle misaligned slices could further improve its real-world applicability.
The summary does not detail the trade-off between reconstruction quality and computational cost, although the abstract reports efficiency that is better than or comparable to previous state-of-the-art methods. A breakdown of memory use and run time would help judge suitability for practical deployment.
While the paper compares DiffusionBlend to several state-of-the-art methods, it would be interesting to see how it performs against other 3D reconstruction approaches, such as the Physics-informed Score-based Diffusion Model for Limited-angle Reconstruction of Cardiac Computed Tomography or the 2.5D Multi-View Averaging Diffusion Model for 3D data.

Overall, DiffusionBlend represents an important advance in 3D CT reconstruction, and the ideas presented in this paper could inspire further research and development in this critical field of medical imaging.

Conclusion

The DiffusionBlend method proposed in this paper offers a novel approach to 3D computed tomography reconstruction by leveraging position-aware diffusion score blending to learn an effective 3D image prior. The experimental results demonstrate the effectiveness of this method in producing high-quality 3D reconstructions, which could have significant implications for medical diagnosis and treatment planning. While the paper identifies some potential areas for future work, DiffusionBlend represents an important step forward in the field of 3D medical imaging.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems

Jason Hu, Bowen Song, Xiaojian Xu, Liyue Shen, Jeffrey A. Fessler

Diffusion models can learn strong image priors from underlying data distribution and use them to solve inverse problems, but the training process is computationally expensive and requires lots of data. Such bottlenecks prevent most existing works from being feasible for high-dimensional and high-resolution data such as 3D images. This paper proposes a method to learn an efficient data prior for the entire image by training diffusion models only on patches of images. Specifically, we propose a patch-based position-aware diffusion inverse solver, called PaDIS, where we obtain the score function of the whole image through scores of patches and their positional encoding and utilize this as the prior for solving inverse problems. First of all, we show that this diffusion model achieves an improved memory efficiency and data efficiency while still maintaining the capability to generate entire images via positional encoding. Additionally, the proposed PaDIS model is highly flexible and can be plugged in with different diffusion inverse solvers (DIS). We demonstrate that the proposed PaDIS approach enables solving various inverse problems in both natural and medical image domains, including CT reconstruction, deblurring, and superresolution, given only patch-based priors. Notably, PaDIS outperforms previous DIS methods trained on entire image priors in the case of limited training data, demonstrating the data efficiency of our proposed approach by learning patch-based prior.

6/5/2024

cs.CV cs.AI

CT Reconstruction using Diffusion Posterior Sampling conditioned on a Nonlinear Measurement Model

Shudong Li, Xiao Jiang, Matthew Tivnan, Grace J. Gang, Yuan Shen, J. Webster Stayman

Diffusion models have been demonstrated as powerful deep learning tools for image generation in CT reconstruction and restoration. Recently, diffusion posterior sampling, where a score-based diffusion prior is combined with a likelihood model, has been used to produce high quality CT images given low-quality measurements. This technique is attractive since it permits a one-time, unsupervised training of a CT prior, which can then be incorporated with an arbitrary data model. However, current methods rely on a linear model of x-ray CT physics to reconstruct or restore images. While it is common to linearize the transmission tomography reconstruction problem, this is an approximation to the true and inherently nonlinear forward model. We propose a new method that solves the inverse problem of nonlinear CT image reconstruction via diffusion posterior sampling. We implement a traditional unconditional diffusion model by training a prior score function estimator, and apply Bayes rule to combine this prior with a measurement likelihood score function derived from the nonlinear physical model to arrive at a posterior score function that can be used to sample the reverse-time diffusion process. This plug-and-play method allows incorporation of a diffusion-based prior with generalized nonlinear CT image reconstruction into multiple CT system designs with different forward models, without the need for any additional training. We develop the algorithm that performs this reconstruction, including an ordered-subsets variant for accelerated processing, and demonstrate the technique in both fully sampled low dose data and sparse-view geometries using a single unsupervised training of the prior.

6/12/2024

cs.CV eess.IV

Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models

Zeyu Yang, Zijie Pan, Chun Gu, Li Zhang

Recent advancements in 3D generation are predominantly propelled by improvements in 3D-aware image diffusion models which are pretrained on Internet-scale image data and fine-tuned on massive 3D data, offering the capability of producing highly consistent multi-view images. However, due to the scarcity of synchronized multi-view video data, it is impractical to adapt this paradigm to 4D generation directly. Despite that, the available video and 3D data are adequate for training video and multi-view diffusion models separately that can provide satisfactory dynamic and geometric priors respectively. To take advantage of both, this paper presents Diffusion$^2$, a novel framework for dynamic 3D content creation that reconciles the knowledge about geometric consistency and temporal smoothness from these models to directly sample dense multi-view multi-frame images which can be employed to optimize continuous 4D representation. Specifically, we design a simple yet effective denoising strategy via score composition of pretrained video and multi-view diffusion models based on the probability structure of the target image array. Owing to the high parallelism of the proposed image generation process and the efficiency of the modern 4D reconstruction pipeline, our framework can generate 4D content within a few minutes. Additionally, our method circumvents the reliance on 4D data, thereby having the potential to benefit from the scaling of the foundation video and multi-view diffusion models. Extensive experiments demonstrate the efficacy of our proposed framework and its ability to flexibly handle various types of prompts.

5/24/2024

cs.CV

Physics-informed Score-based Diffusion Model for Limited-angle Reconstruction of Cardiac Computed Tomography

Shuo Han, Yongshun Xu, Dayang Wang, Bahareh Morovati, Li Zhou, Jonathan S. Maltz, Ge Wang, Hengyong Yu

Cardiac computed tomography (CT) has emerged as a major imaging modality for the diagnosis and monitoring of cardiovascular diseases. High temporal resolution is essential to ensure diagnostic accuracy. Limited-angle data acquisition can reduce scan time and improve temporal resolution, but typically leads to severe image degradation and motivates improved reconstruction techniques. In this paper, we propose a novel physics-informed score-based diffusion model (PSDM) for limited-angle reconstruction of cardiac CT. At sampling time, we combine a data prior from a diffusion model and a model prior obtained via an iterative algorithm and Fourier fusion to further enhance the image quality. Specifically, our approach integrates the primal-dual hybrid gradient (PDHG) algorithm with score-based diffusion models, thereby enabling us to reconstruct high-quality cardiac CT images from limited-angle data. The numerical simulations and real data experiments confirm the effectiveness of our proposed approach.

5/24/2024

eess.IV