Validation of musculoskeletal segmentation model with uncertainty estimation for bone and muscle assessment in hip-to-knee clinical CT images

Read original: arXiv:2409.02770 - Published 9/5/2024 by Mazen Soufi, Yoshito Otake, Makoto Iwasa, Keisuke Uemura, Tomoki Hakotani, Masahiro Hashimoto, Yoshitake Yamada, Minoru Yamada, Yoichi Yokoyama, Masahiro Jinzaki and 5 others

Overview

Deep learning has enabled automated, accurate, and rapid analysis of musculoskeletal (MSK) structures from medical images.
Current approaches have limitations, such as only analyzing 2D cross-sectional images, addressing few structures, or being validated on small datasets.
This study aimed to validate an improved deep learning model for volumetric MSK segmentation of the hip and thigh, with uncertainty estimation, using clinical computed tomography (CT) images from diverse databases.

Plain English Explanation

Deep learning-based image segmentation lets clinicians analyze musculoskeletal (MSK) structures, such as bones and muscles, in medical images quickly, accurately, and without manual input. This is a significant improvement over earlier manual or semi-automated methods.

However, existing deep learning approaches for MSK analysis have limitations. They either work only on 2D cross-sectional images, cover only a few structures, or were tested on small datasets, which restricts their use on large-scale medical databases.

This study addresses these limitations by validating an improved deep learning model that performs volumetric (3D) segmentation of the hip and thigh region in clinical CT images. The researchers used CT databases covering multiple scanner manufacturers, disease statuses, and patient positions, so that the evaluation reflects the variety seen in clinical practice.

In addition to segmentation accuracy, the study evaluated how accurately the volume and density (mean Hounsfield units, HU) of each structure could be estimated from the segmentations. The researchers also investigated a way to detect inaccurate or failed segmentations from the model's predictive uncertainty, which matters for reliable use in real-world clinical settings.

Technical Explanation

The improved deep learning model developed in this study used a volumetric approach to segment MSK structures, rather than the 2D cross-sectional analysis done in previous work. This allowed the model to capture the full 3D shape and characteristics of the hip and thigh region.

The model was trained and validated using CT image databases from multiple manufacturers and scanners, encompassing a diverse range of patient disease statuses and positioning. This helped ensure the model's robustness and generalizability to different clinical scenarios.

The segmentation accuracy was evaluated using standard metrics, such as Dice similarity coefficient and Hausdorff distance. The model also provided estimates of the volume and mean density (in Hounsfield units) of the segmented structures, which were compared to ground truth measurements.
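
The quantities named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes each mask is stored as a set of (z, y, x) voxel indices and the CT intensities as a dictionary keyed by the same indices; the function names and toy values are hypothetical. The Hausdorff distance is computed over all mask voxels for simplicity, whereas segmentation toolkits usually restrict it to surface voxels and use array libraries for speed.

```python
import math

def dice(pred, truth):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not pred and not truth:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

def hausdorff_mm(pred, truth, spacing):
    """Symmetric Hausdorff distance in mm between two non-empty voxel sets.

    Brute force over every mask voxel, O(N*M); fine for a toy example only.
    """
    a = [tuple(i * s for i, s in zip(v, spacing)) for v in pred]
    b = [tuple(i * s for i, s in zip(v, spacing)) for v in truth]

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))

def volume_cm3(mask, spacing):
    """Structure volume: voxel count times the volume of one voxel."""
    voxel_mm3 = spacing[0] * spacing[1] * spacing[2]
    return len(mask) * voxel_mm3 / 1000.0

def mean_hu(mask, hu):
    """Mean CT intensity (Hounsfield units) over the voxels of a mask."""
    return sum(hu[v] for v in mask) / len(mask)

# Toy data (hypothetical): one slice, voxel spacing (z, y, x) in mm.
spacing = (2.0, 1.0, 1.0)
truth = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 2)}
hu = {(0, 0, 0): 40, (0, 0, 1): 60, (0, 1, 0): 50, (0, 1, 1): 50, (0, 0, 2): -100}

print(dice(pred, truth))                   # 0.75
print(hausdorff_mm(pred, truth, spacing))  # 1.0
print(volume_cm3(truth, spacing))
print(mean_hu(truth, hu), mean_hu(pred, hu))  # 50.0 12.5
```

In this toy case a single misplaced voxel with a fat-like intensity pulls the predicted mean HU from 50.0 down to 12.5 while Dice stays at 0.75, which suggests why the density estimates are evaluated separately from overlap.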

The researchers also investigated a method for detecting inaccurate or failed segmentations based on the predictive uncertainty of the model's outputs. This was done by analyzing the statistical distribution of the model's predictions, so that results with high uncertainty could be flagged as unreliable. According to the abstract, predictive uncertainty separated inaccurate and failed segmentations from acceptable ones with areas under the receiver operating characteristic curve (AUROC) of at least 0.95.
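
One common way to turn a distribution of predictions into an uncertainty score can be sketched as follows. This summary does not state which uncertainty measure the authors used, so the sketch assumes a hypothetical setup: the model is run several times with some source of randomness (for example, dropout left active at inference), and the entropy of the averaged class probabilities is taken per voxel and then averaged per structure. All names are illustrative.

```python
import math

def predictive_entropy(samples):
    """Entropy (in nats) of the mean class distribution for one voxel.

    samples: list of T probability vectors, one per stochastic forward pass,
    each summing to 1.
    """
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / len(samples) for c in range(n_classes)]
    # Zero-probability classes contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in mean if p > 0.0)

def structure_uncertainty(voxel_samples):
    """Average voxel entropy over all voxels assigned to one structure."""
    return sum(predictive_entropy(s) for s in voxel_samples) / len(voxel_samples)

# Two voxels, two stochastic passes, two classes (toy values).
confident = [[1.0, 0.0], [1.0, 0.0]]  # both passes agree
ambiguous = [[1.0, 0.0], [0.0, 1.0]]  # passes disagree completely

print(predictive_entropy(confident))  # 0.0
print(predictive_entropy(ambiguous))  # ln 2, about 0.693
print(structure_uncertainty([confident, ambiguous]))
```

A structure whose average entropy is high would be a candidate for manual review; where to set that cutoff is a separate design decision that depends on how many false alarms are acceptable.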

Critical Analysis

The use of diverse CT image databases from multiple institutions and scanners is a key strength of this study, as it enhances the model's ability to generalize to a wide range of real-world clinical scenarios. However, the paper does not provide details on the specific disease conditions or patient demographics represented in the datasets, which could be important for understanding the model's performance in different subpopulations.

Another potential limitation is that the model was only validated on the hip and thigh region, rather than the full musculoskeletal system. While this focus allows for a more in-depth analysis of a specific anatomical area, it remains to be seen how the model would perform on other MSK structures.

Additionally, while the predictive uncertainty-based failure detection approach is a valuable innovation, the abstract summarizes its performance only as AUROC (at least 0.95 for detecting inaccurate and failed segmentations). Reporting sensitivity and specificity at a concrete operating threshold would help assess how reliable the mechanism is in practice.
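
To make the distinction between a threshold-free score and a threshold-specific one concrete, the sketch below computes both from per-structure uncertainty values and failure labels. The data, the threshold, and the definition of "failed" (for example, Dice below some cutoff) are hypothetical and not taken from the paper. AUROC is computed with the rank-based formulation: the probability that a randomly chosen failed case has higher uncertainty than a randomly chosen successful one.

```python
def auroc(uncertainty, failed):
    """Area under the ROC curve, with ties counted as half a win."""
    pos = [u for u, f in zip(uncertainty, failed) if f]
    neg = [u for u, f in zip(uncertainty, failed) if not f]
    if not pos or not neg:
        raise ValueError("need at least one failed and one successful case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(uncertainty, failed, threshold):
    """Flag cases with uncertainty >= threshold and score the flags.

    Assumes both failed and successful cases are present.
    """
    flagged = [u >= threshold for u in uncertainty]
    tp = sum(1 for f, g in zip(failed, flagged) if f and g)
    fn = sum(1 for f, g in zip(failed, flagged) if f and not g)
    tn = sum(1 for f, g in zip(failed, flagged) if not f and not g)
    fp = sum(1 for f, g in zip(failed, flagged) if not f and g)
    return tp / (tp + fn), tn / (tn + fp)

# Toy per-structure uncertainty scores and failure labels (hypothetical).
uncertainty = [0.05, 0.10, 0.12, 0.30, 0.45, 0.60]
failed = [False, False, False, True, False, True]

print(auroc(uncertainty, failed))                          # 0.875
print(sensitivity_specificity(uncertainty, failed, 0.25))  # (1.0, 0.75)
```

The same AUROC can correspond to quite different sensitivity and specificity pairs depending on where the threshold is placed, which is why both views are useful when judging a failure detector.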

Conclusion

This study validated an improved deep learning model for volumetric segmentation of musculoskeletal structures in clinical CT images. The model's high accuracy in segmentation and in volume and density estimation, together with its ability to detect segmentation failures, suggests it could be a reliable tool for large-scale analysis of MSK structures in diverse medical databases.

These advancements in deep learning-based image analysis have the potential to streamline and enhance the clinical assessment of musculoskeletal health, leading to more efficient and personalized patient care. However, further research is needed to expand the model's capabilities to additional anatomical regions and validate its performance in real-world clinical settings.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Validation of musculoskeletal segmentation model with uncertainty estimation for bone and muscle assessment in hip-to-knee clinical CT images

Mazen Soufi, Yoshito Otake, Makoto Iwasa, Keisuke Uemura, Tomoki Hakotani, Masahiro Hashimoto, Yoshitake Yamada, Minoru Yamada, Yoichi Yokoyama, Masahiro Jinzaki, Suzushi Kusano, Masaki Takao, Seiji Okada, Nobuhiko Sugano, Yoshinobu Sato

Deep learning-based image segmentation has allowed for the fully automated, accurate, and rapid analysis of musculoskeletal (MSK) structures from medical images. However, current approaches were either applied only to 2D cross-sectional images, addressed few structures, or were validated on small datasets, which limits their application in large-scale databases. This study aimed to validate an improved deep learning model for volumetric MSK segmentation of the hip and thigh with uncertainty estimation from clinical computed tomography (CT) images. Databases of CT images from multiple manufacturers/scanners, disease status, and patient positioning were used. The segmentation accuracy, and accuracy in estimating the structures' volume and density, i.e., mean HU, were evaluated. An approach for segmentation failure detection based on predictive uncertainty was also investigated. The model has shown an overall improvement with respect to all segmentation accuracy and structure volume/density evaluation metrics. The predictive uncertainty yielded large areas under the receiver operating characteristic (AUROC) curves (AUROCs ≥ 0.95) in detecting inaccurate and failed segmentations. The high segmentation and muscle volume/density estimation accuracy, along with the high accuracy in failure detection based on the predictive uncertainty, exhibited the model's reliability for analyzing individual MSK structures in large-scale CT databases.

9/5/2024

MSTT-199: MRI Dataset for Musculoskeletal Soft Tissue Tumor Segmentation

Tahsin Reasat, Stephen Chenard, Akhil Rekulapelli, Nicholas Chadwick, Joanna Shechtel, Katherine van Schaik, David S. Smith, Joshua Lawrenz

Accurate musculoskeletal soft tissue tumor segmentation is vital for assessing tumor size, location, diagnosis, and response to treatment, thereby influencing patient outcomes. However, segmentation of these tumors requires clinical expertise, and an automated segmentation model would save valuable time for both clinician and patient. Training an automatic model requires a large dataset of annotated images. In this work, we describe the collection of an MR imaging dataset of 199 musculoskeletal soft tissue tumors from 199 patients. We trained segmentation models on this dataset and then benchmarked them on a publicly available dataset. Our model achieved a state-of-the-art Dice score of 0.79 out of the box without any fine-tuning, which shows the diversity and utility of our curated dataset. We analyzed the model predictions and found that its performance suffered on fibrous and vascular tumors due to their diverse anatomical location, size, and intensity heterogeneity. The code and models are available in the following GitHub repository: https://github.com/Reasat/mstt

9/6/2024

Uncertainty estimates for semantic segmentation: providing enhanced reliability for automated motor claims handling

Jan Kuchler (ControlExpert GmbH, Langenfeld, Germany), Daniel Kroll (ControlExpert GmbH, Langenfeld, Germany), Sebastian Schoenen (ControlExpert GmbH, Langenfeld, Germany), Andreas Witte (ControlExpert GmbH, Langenfeld, Germany)

Deep neural network models for image segmentation can be a powerful tool for the automation of motor claims handling processes in the insurance industry. A crucial aspect is the reliability of the model outputs when facing adverse conditions, such as low quality photos taken by claimants to document damages. We explore the use of a meta-classification model to empirically assess the precision of segments predicted by a model trained for the semantic segmentation of car body parts. Different sets of features correlated with the quality of a segment are compared, and an AUROC score of 0.915 is achieved for distinguishing between high- and low-quality segments. By removing low-quality segments, the average mIoU of the segmentation output is improved by 16 percentage points and the number of wrongly predicted segments is reduced by 77%.

5/20/2024

Deep learning-based brain segmentation model performance validation with clinical radiotherapy CT

Selena Huisman, Matteo Maspero, Marielle Philippens, Joost Verhoeff, Szabolcs David

Manual segmentation of medical images is labor intensive and especially challenging for images with poor contrast or resolution. The presence of disease exacerbates this further, increasing the need for an automated solution. To this end, SynthSeg is a robust deep learning model designed for automatic brain segmentation across various contrasts and resolutions. This study validates the SynthSeg robust brain segmentation model on computed tomography (CT), using a multi-center dataset. An open access dataset of 260 paired CT and magnetic resonance imaging (MRI) scans from radiotherapy patients treated in 5 centers was collected. Brain segmentations from CT and MRI were obtained with the SynthSeg model, a component of the Freesurfer imaging suite. These segmentations were compared and evaluated using Dice scores and Hausdorff 95 distance (HD95), treating MRI-based segmentations as the ground truth. Brain regions that failed to meet performance criteria were excluded based on automated quality control (QC) scores. Dice scores indicate a median overlap of 0.76 (IQR: 0.65-0.83). The median HD95 is 2.95 mm (IQR: 1.73-5.39). QC score-based thresholding improves median Dice by 0.1 and median HD95 by 0.05 mm. Morphological differences related to sex and age, as detected by MRI, were also replicated with CT, with an approximate 17% difference between the CT and MRI results for sex and 10% difference between the results for age. SynthSeg can be utilized for CT-based automatic brain segmentation, but only in applications where precision is not essential. CT performance is lower than MRI based on the integrated QC scores, but low-quality segmentations can be excluded with QC-based thresholding. Additionally, performing CT-based neuroanatomical studies is encouraged, as the results show correlations in sex- and age-based analyses similar to those found with MRI.

6/26/2024