Do High-Performance Image-to-Image Translation Networks Enable the Discovery of Radiomic Features? Application to MRI Synthesis from Ultrasound in Prostate Cancer

Read original: arXiv:2403.18651 - Published 7/29/2024 by Mohammad R. Salmanpour, Amin Mousavi, Yixi Xu, William B Weeks, Ilker Hacihaliloglu

Overview

This study investigates the capabilities of different image-to-image translation networks in medical imaging, specifically their ability to synthesize MRI from Ultrasound images of prostate cancer patients.
Five well-known networks were evaluated: 2DPix2Pix, 2DCycleGAN, 3DCycleGAN, 3DUNET, and 3DAutoEncoder.
The study assessed the networks' performance using common evaluation metrics and also examined their ability to capture low-level radiomics features.
Additionally, a qualitative evaluation by medical experts was conducted to further understand the networks' capabilities.

Plain English Explanation

The researchers wanted to see how well different AI models could take medical ultrasound images and generate matching MRI images. This is called "image-to-image translation" and could be helpful for doctors who don't have access to expensive MRI scanners.

The researchers tested five popular AI models for this task, including 2DPix2Pix, 2DCycleGAN, and others. They looked at how accurate the generated MRI images were compared to real MRI scans, using standard metrics like Structural Similarity Index (SSIM).

The researchers also analyzed whether these AI models could preserve subtle quantitative features in the images, called "radiomics features," that might be useful for diagnosing or monitoring prostate cancer. They found that although the generated MRI images scored well on standard metrics, the models struggled to capture all the low-level details that doctors would need to make medical decisions.

Overall, this study suggests that while these AI translation models show promise, there is still work to be done to make them truly useful in real clinical settings where doctors need to see all the relevant details in medical images.

Technical Explanation

The study used data from 794 prostate cancer patients to evaluate the performance of five prominent image-to-image translation networks in medical imaging: 2DPix2Pix, 2DCycleGAN, 3DCycleGAN, 3DUNET, and 3DAutoEncoder.

The networks were tasked with synthesizing MRI images from Ultrasound images, and their performance was assessed using four common evaluation metrics: Mean Absolute Error (MAE), Mean Square Error (MSE), Structural Similarity Index (SSIM), and Peak Signal-to-Noise Ratio (PSNR). The networks achieved high SSIM scores, often exceeding 0.95.
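
As an illustration of the four metrics named above, here is a minimal pure-Python sketch. It is not the study's code: the function names are hypothetical, images are assumed to be flattened lists of intensities scaled to [0, 1], and `global_ssim` computes SSIM once over the whole image, whereas the usual definition averages it over local windows.

```python
import math

def mae(real, synth):
    """Mean Absolute Error between two equal-length intensity lists."""
    return sum(abs(r - s) for r, s in zip(real, synth)) / len(real)

def mse(real, synth):
    """Mean Square Error between two equal-length intensity lists."""
    return sum((r - s) ** 2 for r, s in zip(real, synth)) / len(real)

def psnr(real, synth, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB; infinite for identical images."""
    err = mse(real, synth)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / err)

def global_ssim(real, synth, data_range=1.0):
    """Simplified SSIM over the whole image (assumption: no sliding window)."""
    n = len(real)
    mu_r = sum(real) / n
    mu_s = sum(synth) / n
    var_r = sum((r - mu_r) ** 2 for r in real) / n
    var_s = sum((s - mu_s) ** 2 for s in synth) / n
    cov = sum((r - mu_r) * (s - mu_s) for r, s in zip(real, synth)) / n
    # Stabilising constants with the conventional K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_r * mu_s + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_s ** 2 + c1) * (var_r + var_s + c2)
    return numerator / denominator

# Made-up toy intensities standing in for a real and a synthesized MRI.
real_mri = [0.0, 0.5, 1.0, 0.5]
synth_mri = [0.1, 0.5, 0.9, 0.5]

print("MAE :", mae(real_mri, synth_mri))
print("MSE :", mse(real_mri, synth_mri))
print("PSNR:", psnr(real_mri, synth_mri))
print("SSIM:", global_ssim(real_mri, synth_mri))
```

All four numbers summarize overall intensity agreement between the two images, so a high score does not by itself show that fine texture is preserved, which is why the study adds a radiomics analysis.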

However, the researchers also conducted a more in-depth analysis using radiomics features (RFs), compared via the Spearman correlation coefficient, to investigate whether the high-performing networks could capture low-level details in the images. They found that 75 of the 186 RFs were recovered only by the 2DPix2Pix algorithm, while about half of the RFs were lost during the translation process.

To further understand the networks' capabilities, the researchers also had five medical experts conduct a qualitative assessment, which indicated a lack of low-level feature discovery in the generated images, despite the strong numerical performance metrics.

Critical Analysis

While the image-to-image translation networks achieved impressive quantitative results, the study highlights some important limitations and areas for further research:

The inability of the networks to fully capture low-level radiomics features, which are crucial for medical decision-making, suggests that more work is needed before these models are suitable for clinical practice.
The qualitative assessment by medical experts underscores the importance of incorporating domain-specific expertise when evaluating the suitability of these models for real-world clinical environments.
The study was limited to a specific use case (prostate cancer) and a relatively small dataset. Further research is needed across a wider range of medical imaging modalities and clinical scenarios.
The study did not explore additional techniques that might improve the networks' ability to capture clinically relevant features.

Conclusion

This study provides valuable insights into the current capabilities and limitations of image-to-image translation networks in medical imaging. While the networks achieved high numerical performance, the inability to fully capture low-level radiomics features and the lack of alignment with expert qualitative assessments suggest that more work is needed to develop medical image translation models that are truly suitable for clinical practice.

By addressing these challenges and incorporating domain-specific expertise, future research may unlock the full potential of these techniques to enhance medical diagnosis, treatment planning, and patient outcomes, especially in resource-constrained settings where access to advanced imaging modalities is limited.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Do High-Performance Image-to-Image Translation Networks Enable the Discovery of Radiomic Features? Application to MRI Synthesis from Ultrasound in Prostate Cancer

Mohammad R. Salmanpour, Amin Mousavi, Yixi Xu, William B Weeks, Ilker Hacihaliloglu

This study investigates the foundational characteristics of image-to-image translation networks, specifically examining their suitability and transferability within the context of routine clinical environments, despite achieving high levels of performance, as indicated by a Structural Similarity Index (SSIM) exceeding 0.95. The evaluation study was conducted using data from 794 patients diagnosed with prostate cancer. To synthesize MRI from Ultrasound images, we employed five widely recognized image-to-image translation networks in medical imaging: 2DPix2Pix, 2DCycleGAN, 3DCycleGAN, 3DUNET, and 3DAutoEncoder. For quantitative assessment, we report four prevalent evaluation metrics: Mean Absolute Error, Mean Square Error, Structural Similarity Index (SSIM), and Peak Signal to Noise Ratio. Moreover, a complementary analysis employing Radiomic features (RF) via Spearman correlation coefficient was conducted to investigate, for the first time, whether networks achieving high performance, SSIM greater than 0.85, could identify low-level RFs. The RF analysis showed 75 features out of 186 RFs were discovered via just 2DPix2Pix algorithm while half of RFs were lost in the translation process. Finally, a detailed qualitative assessment by five medical doctors indicated a lack of low-level feature discovery in image-to-image translation tasks.

7/29/2024

Similarity Metrics for MR Image-To-Image Translation

Melanie Dohmen, Mark Klemens, Ivo Baltruschat, Tuan Truong, Matthias Lenga

Image-to-image translation can create large impact in medical imaging, for instance the possibility to synthetically transform images to other modalities, sequence types, higher resolutions or lower noise levels. In order to assure a high level of patient safety, these methods are mostly validated by human reader studies, which require a considerable amount of time and costs. Quantitative metrics have been used to complement such studies and to provide reproducible and objective assessment of synthetic images. Even though the SSIM and PSNR metrics are extensively used, they do not detect all types of errors in synthetic images as desired. Other metrics could provide additional useful evaluation. In this study, we give an overview and a quantitative analysis of 15 metrics for assessing the quality of synthetically generated images. We include 11 full-reference metrics (SSIM, MS-SSIM, CW-SSIM, PSNR, MSE, NMSE, MAE, LPIPS, DISTS, NMI and PCC), three non-reference metrics (BLUR, MLC, MSLC) and one downstream task segmentation metric (DICE) to detect 11 kinds of typical distortions and artifacts that occur in MR images. In addition, we analyze the influence of four prominent normalization methods (Minmax, cMinmax, Zscore and Quantile) on the different metrics and distortions. Finally, we provide adverse examples to highlight pitfalls in metric assessment and derive recommendations for effective usage of the analyzed similarity metrics for evaluation of image-to-image translation models.

6/19/2024

Domain Transfer Through Image-to-Image Translation for Uncertainty-Aware Prostate Cancer Classification

Meng Zhou, Amoon Jamzad, Jason Izard, Alexandre Menard, Robert Siemens, Parvin Mousavi

Prostate Cancer (PCa) is a prevalent disease among men, and multi-parametric MRIs offer a non-invasive method for its detection. While MRI-based deep learning solutions have shown promise in supporting PCa diagnosis, acquiring sufficient training data, particularly in local clinics remains challenging. One potential solution is to take advantage of publicly available datasets to pre-train deep models and fine-tune them on the local data, but multi-source MRIs can pose challenges due to cross-domain distribution differences. These limitations hinder the adoption of explainable and reliable deep-learning solutions in local clinics for PCa diagnosis. In this work, we present a novel approach for unpaired image-to-image translation of prostate multi-parametric MRIs and an uncertainty-aware training approach for classifying clinically significant PCa, to be applied in data-constrained settings such as local and small clinics. Our approach involves a novel pipeline for translating unpaired 3.0T multi-parametric prostate MRIs to 1.5T, thereby augmenting the available training data. Additionally, we introduce an evidential deep learning approach to estimate model uncertainty and employ dataset filtering techniques during training. Furthermore, we propose a simple, yet efficient Evidential Focal Loss, combining focal loss with evidential uncertainty, to train our model effectively. Our experiments demonstrate that the proposed method significantly improves the Area Under ROC Curve (AUC) by over 20% compared to the previous work. Our code is available at https://github.com/med-i-lab/DT_UE_PCa

6/4/2024

Rethinking Perceptual Metrics for Medical Image Translation

Nicholas Konz, Yuwen Chen, Hanxue Gu, Haoyu Dong, Maciej A. Mazurowski

Modern medical image translation methods use generative models for tasks such as the conversion of CT images to MRI. Evaluating these methods typically relies on some chosen downstream task in the target domain, such as segmentation. On the other hand, task-agnostic metrics are attractive, such as the network feature-based perceptual metrics (e.g., FID) that are common to image translation in general computer vision. In this paper, we investigate evaluation metrics for medical image translation on two medical image translation tasks (GE breast MRI to Siemens breast MRI and lumbar spine MRI to CT), tested on various state-of-the-art translation methods. We show that perceptual metrics do not generally correlate with segmentation metrics due to them extending poorly to the anatomical constraints of this sub-field, with FID being especially inconsistent. However, we find that the lesser-used pixel-level SWD metric may be useful for subtle intra-modality translation. Our results demonstrate the need for further research into helpful metrics for medical image translation.

4/12/2024