Large Language Model-Augmented Auto-Delineation of Treatment Target Volume in Radiation Therapy

Read original: arXiv:2407.07296 - Published 7/11/2024 by Praveenbalaji Rajendran, Yong Yang, Thomas R. Niedermayr, Michael Gensheimer, Beth Beadle, Quynh-Thu Le, Lei Xing, Xianjin Dai

Overview

Radiation therapy is a crucial cancer treatment, but accurate delineation of target areas is challenging and relies on manual processes by experts.
Advancements in artificial intelligence (AI) have improved auto-contouring of normal tissues, but accurately delineating radiation therapy (RT) target volumes remains difficult.
The study proposes the Radformer, a visual language model-based network for auto-delineating RT target volumes, to address this challenge.

Plain English Explanation

Radiation therapy is one of the most effective ways to treat cancer, but to make it work, doctors need to accurately identify the specific areas of the body that need treatment. Currently, this is done manually by human experts, which is time-consuming, tedious, and can vary between different people doing the work.

While AI has helped automate the process of identifying normal, healthy tissues, accurately defining the cancer target areas for radiation therapy is still a major challenge. The researchers in this study developed a new AI model called the Radformer to try to solve this problem.

The Radformer uses a combination of visual and language features to automatically delineate, or outline, the radiation therapy target volumes. It takes advantage of advances in large language models and multimodal AI to extract relevant information from clinical data.

The researchers tested the Radformer on a dataset of 2,985 head-and-neck cancer patients who underwent radiation therapy. It outperformed other state-of-the-art models in delineating the target areas, which suggests the Radformer could help streamline and standardize this crucial step in radiation therapy.

Technical Explanation

The study proposes the Radformer, a visual language model-based radiation therapy (RT) target volume auto-delineation network. The Radformer utilizes a hierarchical vision transformer as its backbone and incorporates large language models to extract text-rich features from clinical data.

A key innovation is the visual language attention module (VLAM), which integrates visual and linguistic features to enable language-aware visual encoding (LAVE). This allows the model to jointly leverage visual and textual information for more accurate target volume delineation.
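
To make the idea of language-aware visual encoding concrete, here is a minimal sketch of generic cross-attention in which image-patch tokens attend to text tokens. The summary does not describe the internals of the VLAM, so this is not the authors' design: the function name `language_aware_visual_encoding`, the projection matrices, the residual connection, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def language_aware_visual_encoding(visual, text, w_q, w_k, w_v):
    """Hypothetical cross-attention fusion (not the paper's VLAM).

    visual: (n_visual, d_visual) image-patch tokens, used as queries.
    text:   (n_text, d_text) token features from a language model,
            used as keys and values.
    Returns the fused visual tokens and the attention weights.
    """
    q = visual @ w_q                      # (n_visual, d_attn)
    k = text @ w_k                        # (n_text, d_attn)
    v = text @ w_v                        # (n_text, d_visual)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)       # each visual token weights the text tokens
    fused = visual + attn @ v             # residual connection (an assumption)
    return fused, attn


# Toy example with random features; all sizes are arbitrary.
rng = np.random.default_rng(0)
visual = rng.normal(size=(16, 32))        # 16 patch tokens, 32 channels
text = rng.normal(size=(5, 24))           # 5 text tokens, 24 channels
w_q = rng.normal(size=(32, 8)) * 0.1
w_k = rng.normal(size=(24, 8)) * 0.1
w_v = rng.normal(size=(24, 32)) * 0.1

fused, attn = language_aware_visual_encoding(visual, text, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

The fused tokens keep the shape of the visual tokens, so a block like this could in principle sit between stages of a vision backbone without changing the downstream layers.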

The Radformer was evaluated on a dataset of 2,985 head-and-neck cancer patients who underwent radiation therapy. Performance was assessed quantitatively with the Dice similarity coefficient (DSC), intersection over union (IoU), and the 95th percentile Hausdorff distance (HD95).
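
As a rough illustration of the two overlap metrics, the sketch below computes the Dice similarity coefficient and intersection over union for a pair of binary masks. It is not the paper's evaluation code: the function name `dice_and_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions. The Hausdorff distance is left out because it needs contour surface extraction and voxel spacing, which the summary does not specify.

```python
import numpy as np


def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()

    if total == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a convention assumed here, not taken from the paper.
        return 1.0, 1.0

    dice = 2.0 * intersection / total
    iou = intersection / union
    return float(dice), float(iou)


# Toy 2D example: a 2x2 predicted region inside a 2x3 reference region.
pred = np.zeros((4, 4), dtype=bool)
truth = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:3] = True     # 4 voxels
truth[1:3, 1:4] = True    # 6 voxels, 4 of them shared with pred

dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

The two scores are tied by IoU = Dice / (2 - Dice), so they always rank a pair of segmentations of the same case in the same order; reporting both is mainly a convenience for comparison with other work.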

The results demonstrate that the Radformer outperforms other state-of-the-art models in segmentation accuracy, validating its potential for adoption in RT practice.

Critical Analysis

The paper provides a thorough technical evaluation of the Radformer model and its performance on a large, real-world dataset of head-and-neck cancer patients. However, the authors acknowledge several limitations and areas for further research.

For example, the model was only tested on head-and-neck cancer cases, so its generalizability to other cancer types and anatomical regions is unclear. Additionally, the dataset used for training and evaluation was retrospective, so prospective validation on new patient data would be valuable.

The authors also note that the Radformer currently relies on manual segmentation of normal tissues as input, which could introduce bias and variability. Developing end-to-end solutions that automate the entire target volume delineation process would be an important next step.

Furthermore, the clinical impact and practical implementation of the Radformer in actual radiation therapy workflows are not explored in depth. Assessing factors like workflow integration, user experience, and impact on treatment outcomes would be crucial for real-world deployment.

Conclusion

This study presents the Radformer, an AI model that shows promising results in automating the delineation of radiation therapy target volumes, a crucial step in cancer treatment. By combining visual and language-based features, the Radformer could help streamline and standardize this manual process and improve the efficiency and consistency of radiation therapy planning.

While further research is needed to address the identified limitations, the Radformer's strong performance on a large clinical dataset suggests it has the potential to significantly impact radiation oncology practice if successfully integrated into real-world workflows.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Large Language Model-Augmented Auto-Delineation of Treatment Target Volume in Radiation Therapy

Praveenbalaji Rajendran, Yong Yang, Thomas R. Niedermayr, Michael Gensheimer, Beth Beadle, Quynh-Thu Le, Lei Xing, Xianjin Dai

Radiation therapy (RT) is one of the most effective treatments for cancer, and its success relies on the accurate delineation of targets. However, target delineation is a comprehensive medical decision that currently relies purely on manual processes by human experts. Manual delineation is time-consuming, laborious, and subject to interobserver variations. Although the advancements in artificial intelligence (AI) techniques have significantly enhanced the auto-contouring of normal tissues, accurate delineation of RT target volumes remains a challenge. In this study, we propose a visual language model-based RT target volume auto-delineation network termed Radformer. The Radformer utilizes a hierarchical vision transformer as the backbone and incorporates large language models to extract text-rich features from clinical data. We introduce a visual language attention module (VLAM) for integrating visual and linguistic features for language-aware visual encoding (LAVE). The Radformer has been evaluated on a dataset comprising 2985 patients with head-and-neck cancer who underwent RT. Metrics, including the Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95), were used to evaluate the performance of the model quantitatively. Our results demonstrate that the Radformer has superior segmentation performance compared to other state-of-the-art models, validating its potential for adoption in RT practice.

7/11/2024

LLM-driven Multimodal Target Volume Contouring in Radiation Oncology

Yujin Oh, Sangjoon Park, Hwa Kyung Byun, Yeona Cho, Ik Jae Lee, Jin Sung Kim, Jong Chul Ye

Target volume contouring for radiation therapy is considered significantly more challenging than normal organ segmentation tasks, as it necessitates the utilization of both image and text-based clinical information. Inspired by the recent advancement of large language models (LLMs) that can facilitate the integration of textual information and images, here we present a novel LLM-driven multimodal AI, namely LLMSeg, that utilizes the clinical text information and is applicable to the challenging task of target volume contouring for radiation therapy, and validate it within the context of breast cancer radiation therapy target volume contouring. Using external validation and data-insufficient environments, which are attributes highly conducive to real-world applications, we demonstrate that the proposed model exhibits markedly improved performance compared to conventional unimodal AI models, particularly exhibiting robust generalization performance and data efficiency. To the best of our knowledge, this is the first LLM-driven multimodal AI model that integrates the clinical text information into target volume delineation for radiation oncology.

4/16/2024

Quality assurance of organs-at-risk delineation in radiotherapy

Yihao Zhao, Cuiyun Yuan, Ying Liang, Yang Li, Chunxia Li, Man Zhao, Jun Hu, Wei Liu, Chenbin Liu

The delineation of tumor targets and organs-at-risk (OARs) is critical in radiotherapy treatment planning. Automatic segmentation can be used to reduce the physician workload and improve consistency. However, quality assurance of automatic segmentation is still an unmet need in clinical practice. The patient data used in our study was a standardized dataset from the AAPM Thoracic Auto-Segmentation Challenge. The OARs included were the left and right lungs, heart, esophagus, and spinal cord. Two groups of OARs were generated: the benchmark dataset, manually contoured by experienced physicians, and the test dataset, automatically created using the AccuContour software. A ResNet-152 network was used as the feature extractor, and a one-class support vector classifier was used to determine high or low quality. We evaluate the model performance with balanced accuracy, F-score, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). We randomly generated contour errors to assess the generalization of our method, explored the detection limit, and evaluated the correlations between detection limit and various metrics such as volume, Dice similarity coefficient, Hausdorff distance, and mean surface distance. The proposed one-class classifier outperformed in metrics such as balanced accuracy, AUC, and others. The proposed method showed significant improvement over binary classifiers in handling various types of errors. Our proposed model, which introduces a residual network and attention mechanism in the one-class classification framework, was able to detect the various types of OAR contour errors with high accuracy. The proposed method can significantly reduce the burden of physician review for contour delineation.

5/21/2024

ARANet: Attention-based Residual Adversarial Network with Deep Supervision for Radiotherapy Dose Prediction of Cervical Cancer

Lu Wen, Wenxia Yin, Zhenghao Feng, Xi Wu, Deng Xiong, Yan Wang

Radiation therapy is the mainstay treatment for cervical cancer, and its ultimate goal is to ensure the planning target volume (PTV) reaches the prescribed dose while reducing dose deposition of organs-at-risk (OARs) as much as possible. To achieve these clinical requirements, the medical physicist needs to manually tweak the radiotherapy plan repeatedly in a trial-and-error manner until finding the optimal one in the clinic. However, such trial-and-error processes are quite time-consuming, and the quality of plans highly depends on the experience of the medical physicist. In this paper, we propose an end-to-end Attention-based Residual Adversarial Network with deep supervision, namely ARANet, to automatically predict the 3D dose distribution of cervical cancer. Specifically, given the computed tomography (CT) images and their corresponding segmentation masks of PTV and OARs, ARANet employs a prediction network to generate the dose maps. We also utilize a multi-scale residual attention module and deep supervision mechanism to enforce the prediction network to extract more valuable dose features while suppressing irrelevant information. Our proposed method is validated on an in-house dataset including 54 cervical cancer patients, and experimental results have demonstrated its obvious superiority compared to other state-of-the-art methods.

8/27/2024