2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

2406.08374

Published 6/18/2024 by Tianqi Chen, Jun Hou, Yinchi Zhou, Huidong Xie, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, James S. Duncan, Chi Liu and 1 other

cs.CV cs.AI eess.IV

Abstract

Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation hazards to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate the non-attenuation-corrected low-dose PET (NAC-LDPET) into attenuation-corrected standard-dose PET (AC-SDPET). Recently, diffusion models have emerged as a new state-of-the-art deep learning method for image-to-image translation, better than traditional CNN-based methods. However, due to the high computation cost and memory burden, it is largely limited to 2D applications. To address these challenges, we developed a novel 2.5D Multi-view Averaging Diffusion Model (MADM) for 3D image-to-image translation with application on NAC-LDPET to AC-SDPET translation. Specifically, MADM employs separate diffusion models for axial, coronal, and sagittal views, whose outputs are averaged in each sampling step to ensure the 3D generation quality from multiple views. To accelerate the 3D sampling process, we also proposed a strategy to use the CNN-based 3D generation as a prior for the diffusion model. Our experimental results on human patient studies suggested that MADM can generate high-quality 3D translation images, outperforming previous CNN-based and Diffusion-based baseline methods.

Overview

This paper introduces a 2.5D multi-view averaging diffusion model for 3D medical image translation, with a focus on low-count PET reconstruction with CT-less attenuation correction.
The proposed model uses separate diffusion models for the axial, coronal, and sagittal 2D views and averages their outputs at each sampling step to produce a consistent 3D volume, avoiding the computation and memory cost of a fully 3D diffusion model.
The model is demonstrated on the task of reconstructing high-quality 3D PET images from low-count PET data, without requiring a CT scan for attenuation correction.

Plain English Explanation

In medical imaging, doctors often use PET (Positron Emission Tomography) scans to get a detailed look at the inside of the body. PET scans can provide valuable information about the function and metabolism of organs and tissues. However, when the injected tracer dose is reduced to limit radiation exposure, the scanner detects fewer events (a low-count scan), which leads to noisy, low-quality images.

Traditionally, PET scans have relied on CT (Computed Tomography) scans to correct for the absorption of photons in the body, a process called attenuation correction. This makes the PET images more accurate and reliable. However, the CT scan adds to the radiation dose the patient receives.

The researchers in this paper have developed a new AI model that can generate high-quality 3D PET images from low-count PET data, without needing a CT scan for attenuation correction. The model works by taking multiple 2D views of the PET data and combining them to create a 3D output. This 2.5D approach helps the model better understand the 3D structure of the patient's body, even with limited PET data.

By eliminating the need for a CT scan and allowing a lower tracer dose, this model could reduce the radiation exposure of PET imaging while still providing doctors with the high-quality images they need to make accurate diagnoses and treatment decisions.

Technical Explanation

The researchers propose a 2.5D multi-view averaging diffusion model for 3D medical image translation, with a specific application to low-count PET reconstruction with CT-less attenuation correction.

The model is designed to address the challenges of low-count PET data and the need for CT-based attenuation correction. The 2.5D approach leverages multiple 2D views of the PET data, which are then combined to generate a high-quality 3D output.
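The per-step averaging described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the mapping of array axes 0, 1, 2 to the axial, coronal, and sagittal views, and the `toy_denoiser` stand-in for a trained per-view diffusion model are all assumptions made for illustration.

```python
import numpy as np

def denoise_slices(volume, axis, denoise_2d):
    """Apply a 2D model to every slice of `volume` taken along `axis`."""
    moved = np.moveaxis(volume, axis, 0)            # put the slicing axis first
    out = np.stack([denoise_2d(s) for s in moved])  # process slice by slice
    return np.moveaxis(out, 0, axis)                # restore the original layout

def multiview_average_step(volume, denoisers):
    """One sampling step: run each view's model, then average the three volumes.

    `denoisers` holds one 2D model per array axis (assumed order:
    axial, coronal, sagittal; the real order depends on volume orientation).
    """
    views = [denoise_slices(volume, axis, f) for axis, f in enumerate(denoisers)]
    return np.mean(views, axis=0)

def toy_denoiser(slice_2d):
    # Hypothetical placeholder for a trained per-view diffusion model.
    return 0.5 * slice_2d

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 6))                  # toy 3D volume
x_next = multiview_average_step(x, [toy_denoiser] * 3)
```

Because the averaged volume is fed back into all three models at the next step, each 2D model sees context contributed by the other two views, which is one plausible reading of how the averaging promotes 3D consistency.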

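The model is trained to translate non-attenuation-corrected low-dose PET (NAC-LDPET) directly into attenuation-corrected standard-dose PET (AC-SDPET). It uses separate diffusion models for the axial, coronal, and sagittal views and averages their outputs at each sampling step. During inference, the model can therefore produce an attenuation-corrected PET image without requiring a CT scan.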

To accelerate 3D sampling, the researchers also use the output of a CNN-based 3D generation model as a prior for the diffusion model. In experiments on human patient studies, MADM outperformed previous CNN-based and diffusion-based baseline methods.

Other diffusion approaches, such as partitioned Hankel-based, cascaded multi-path shortcut, and physics-informed score-based diffusion models, are related lines of work; the abstract does not describe them as components of MADM.
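The abstract describes using a CNN-based 3D generation as a prior to accelerate sampling, without giving details. One common way to realize such a prior, assumed here rather than taken from the paper, is to forward-diffuse the CNN output to an intermediate step `t0` and run the reverse process only from there. The sketch below uses the standard DDPM forward formula; the linear noise schedule, `t0 = 199`, and the `toy_reverse_step` placeholder are hypothetical choices.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_prior(cnn_output, t0, alpha_bar, rng):
    """Forward-diffuse the CNN prediction: x_t0 = sqrt(a)*x0 + sqrt(1-a)*eps."""
    a = alpha_bar[t0]
    eps = rng.standard_normal(cnn_output.shape)
    return np.sqrt(a) * cnn_output + np.sqrt(1.0 - a) * eps

def sample_from_prior(cnn_output, t0, alpha_bar, reverse_step, rng):
    """Run the reverse process from step t0 down to 0 instead of from the last step."""
    x = noise_prior(cnn_output, t0, alpha_bar, rng)
    for t in range(t0, -1, -1):
        x = reverse_step(x, t)
    return x

calls = []

def toy_reverse_step(x, t):
    # Hypothetical stand-in for one reverse step of a trained diffusion model.
    calls.append(t)
    return x

alpha_bar = make_alpha_bar(1000)
prior = np.zeros((4, 5, 6))                         # stand-in for a CNN prediction
out = sample_from_prior(prior, 199, alpha_bar, toy_reverse_step,
                        np.random.default_rng(0))
```

In this toy setup the reverse loop runs 200 steps instead of the 1000 in the full schedule, which is the kind of saving a warm start can offer; the trade-off is that a smaller `t0` keeps the result closer to the CNN output.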

Critical Analysis

The paper presents a promising approach to addressing the challenges of low-count PET imaging and the need for CT-based attenuation correction. The 2.5D multi-view averaging diffusion model demonstrates the ability to generate high-quality 3D PET images from limited PET data, without requiring a separate CT scan.

One potential limitation is the reliance on paired data for training the model: each NAC-LDPET input needs a matching AC-SDPET target, and producing such targets typically still requires a standard-dose scan with CT-based attenuation correction. Assembling these paired datasets could hinder the model's adoption in some clinical settings.

Additionally, the paper does not provide a detailed analysis of the model's performance on a diverse range of PET data, such as different anatomical regions or disease states. Further evaluation on a broader range of medical images would help assess the generalizability and robustness of the proposed approach.

Although the abstract describes a CNN-based prior to accelerate 3D sampling, running three diffusion models at every sampling step is still likely to be costly. Considerations around model inference speed, memory usage, and integration with existing medical imaging workflows would be important for practical implementation.

Overall, the research presented in this paper represents an innovative and potentially impactful contribution to the field of medical image translation. However, additional validation, robustness testing, and feasibility analysis would be beneficial to fully understand the broader applicability and limitations of the approach.

Conclusion

This paper introduces a novel 2.5D multi-view averaging diffusion model for 3D medical image translation, with a specific application to low-count PET reconstruction with CT-less attenuation correction. The model leverages multiple 2D views of PET data to generate high-quality 3D PET images, addressing the challenges of low-count PET data and the need for CT-based attenuation correction.

The proposed approach has the potential to reduce the radiation dose of PET imaging for patients, while still providing doctors with the detailed information they need for accurate diagnoses and treatment decisions. Further research and validation on a broader range of medical images, as well as considerations around clinical implementation, would help solidify the impact and practical applications of this technology.
