AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking

Read original: arXiv:2407.16697 - Published 7/24/2024 by Wenxuan Li, Chongyu Qu, Xiaoxi Chen, Pedro R. A. S. Bassi, Yijia Shi, Yuxiang Lai, Qian Yu, Huimin Xue, Yixiong Chen, Xiaorui Lin and 11 others

Overview

A large, detailed, and multi-center dataset for abdominal imaging called AbdomenAtlas
Designed to enable efficient transfer learning and open algorithmic benchmarking
Covers a wide range of abdominal anatomy with detailed annotations

Plain English Explanation

AbdomenAtlas is a new dataset of abdominal CT scans. It stands out for its scale and breadth: the scans come from many healthcare centers, and each one has been annotated to mark the organs and other structures within the abdomen.

The key advantage of AbdomenAtlas is that it can be used to help train machine learning models for medical image analysis. By using this dataset, researchers and developers can "transfer" the knowledge learned from these pre-annotated images to build more accurate and robust models for tasks like organ segmentation, disease detection, and treatment planning.

Additionally, the open and standardized nature of the AbdomenAtlas dataset means that it can serve as a common benchmark for different AI algorithms and techniques in the medical imaging domain. Researchers can test their models on the AbdomenAtlas data and compare their performance to other approaches, helping to drive progress in this important field.

Technical Explanation

AbdomenAtlas is a large-scale dataset of abdominal CT scans. According to the paper's abstract, it comprises 20,460 three-dimensional CT volumes sourced from 112 hospitals and provides 673K masks of anatomical structures. Drawing on many centers brings diversity in patient populations, geographies, and facilities.
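To put the dataset's scale in perspective, the short calculation below derives a few ratios from the figures quoted in the paper's abstract (reproduced under Related Papers). It is only back-of-the-envelope arithmetic: "673K" is treated as exactly 673,000, which is an assumption, and the averages say nothing about how masks or volumes are actually distributed.

```python
# Figures quoted from the paper's abstract.
TOTAL_VOLUMES = 20_460    # three-dimensional CT volumes
HOSPITALS = 112           # contributing hospitals
TOTAL_MASKS = 673_000     # "673K" masks, rounded (assumption: exactly 673,000)
MANUAL_VOLUMES = 5_246    # volumes annotated manually by expert radiologists

# Derived ratios (averages only; the real distribution is not stated).
masks_per_volume = TOTAL_MASKS / TOTAL_VOLUMES
volumes_per_hospital = TOTAL_VOLUMES / HOSPITALS
manual_fraction = MANUAL_VOLUMES / TOTAL_VOLUMES
semi_automatic_volumes = TOTAL_VOLUMES - MANUAL_VOLUMES

if __name__ == "__main__":
    print(f"Average masks per volume:      {masks_per_volume:.1f}")
    print(f"Average volumes per hospital:  {volumes_per_hospital:.1f}")
    print(f"Manually annotated share:      {manual_fraction:.1%}")
    print(f"Semi-automatically annotated:  {semi_automatic_volumes} volumes")
```

Roughly a quarter of the volumes were annotated fully by hand; the rest went through the semi-automatic procedure described in the abstract.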

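The annotations were produced by a team of 10 radiologists with the help of AI algorithms. Expert radiologists first manually annotated 22 anatomical structures in 5,246 CT volumes. The remaining volumes were annotated semi-automatically: radiologists revised the annotations predicted by AI, and the AI in turn improved its predictions by learning from the revisions. Annotation at this level of detail supports training accurate machine learning models for tasks like organ segmentation, which is a critical step in many clinical applications.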

The size and diversity of the AbdomenAtlas dataset also make it well-suited for transfer learning. By pre-training models on this data and then fine-tuning them on smaller, task-specific datasets, researchers can leverage the rich knowledge captured in AbdomenAtlas to build highly performant models with limited training data.
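The pre-train-then-fine-tune idea can be illustrated with a deliberately tiny, framework-free toy. This is not the authors' pipeline and involves no CT data: a one-dimensional linear model stands in for a segmentation network, a large synthetic "source" set stands in for AbdomenAtlas, and a five-sample "target" set stands in for a small task-specific dataset. All names and numbers here are hypothetical.

```python
def mse(params, data):
    """Mean squared error of y = w*x + b on (x, y) pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params, data, steps, lr=0.05):
    """Plain gradient descent on MSE, starting from the given parameters."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def make_data(slope, intercept, n):
    """n evenly spaced points on [-1, 1] from a noiseless line."""
    xs = [-1 + 2 * i / (n - 1) for i in range(n)]
    return [(x, slope * x + intercept) for x in xs]


source_data = make_data(2.0, 1.0, 201)   # large "pre-training" set
target_train = make_data(2.2, 0.8, 5)    # small, related target task
target_test = make_data(2.2, 0.8, 41)    # held-out target data

# Pre-train on the large source set, then fine-tune briefly on the target.
pretrained = train((0.0, 0.0), source_data, steps=500)
fine_tuned = train(pretrained, target_train, steps=20)

# Baseline: same small budget on the target, but starting from scratch.
from_scratch = train((0.0, 0.0), target_train, steps=20)

fine_tuned_error = mse(fine_tuned, target_test)
from_scratch_error = mse(from_scratch, target_test)

if __name__ == "__main__":
    print(f"fine-tuned test MSE:   {fine_tuned_error:.4f}")
    print(f"from-scratch test MSE: {from_scratch_error:.4f}")
```

With the same 20 update steps on the target data, the model that starts from pre-trained parameters ends up with lower held-out error than the one trained from scratch, because the source task has already moved it close to a good solution. Real fine-tuning of a segmentation network adds choices this toy omits, such as which layers to freeze and what learning rate to use.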

Additionally, the standardized nature of the AbdomenAtlas dataset enables open algorithmic benchmarking. Researchers can evaluate their models on the AbdomenAtlas data and compare their performance to other state-of-the-art approaches, accelerating progress in the field of medical image analysis.
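Segmentation benchmarks of this kind are commonly scored with the Dice Similarity Coefficient, the metric reported for the FLARE 2023 challenge listed under Related Papers. The sketch below computes per-structure Dice on flattened label maps in plain Python. The function names, the integer label encoding, and the convention of scoring 1.0 when a structure is absent from both prediction and ground truth are assumptions made for illustration, not the benchmark's official evaluation code.

```python
def dice_score(pred, truth, label):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for one label over flattened voxels."""
    if len(pred) != len(truth):
        raise ValueError("prediction and ground truth must have the same size")
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if pred_count + truth_count == 0:
        # Assumed convention: a structure absent from both maps is a perfect match.
        return 1.0
    return 2 * overlap / (pred_count + truth_count)


def mean_dice(pred, truth, labels):
    """Average Dice over the given structure labels (background excluded by caller)."""
    return sum(dice_score(pred, truth, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Toy six-voxel "scan": 0 = background, 1 and 2 = two hypothetical organs.
    truth = [0, 1, 1, 2, 2, 2]
    pred = [0, 1, 2, 2, 2, 0]
    for label in (1, 2):
        print(f"label {label}: Dice = {dice_score(pred, truth, label):.3f}")
    print(f"mean Dice = {mean_dice(pred, truth, [1, 2]):.3f}")
```

A practical evaluation would operate on 3D arrays and usually report additional measures, such as surface distance or inference time, alongside Dice.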

Critical Analysis

While the AbdomenAtlas dataset represents a significant advancement in the field of abdominal imaging, there are a few potential limitations and caveats to consider:

The dataset is primarily composed of CT scans, which may limit its applicability to other imaging modalities like MRI or ultrasound. Further research is needed to explore the transferability of models trained on AbdomenAtlas to these other domains.
The annotations, while highly detailed, may be subject to some degree of inter-rater variability, as multiple radiologists were involved in the labeling process. Additional validation and consistency checks could help to quantify and potentially mitigate this source of uncertainty.
Although the paper describes its sources as spanning diverse populations and geographies, some regions or demographics may still be under-represented, so the data may not capture the full range of anatomical variation. Expanding global representation could further enhance its utility.
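One way to quantify the inter-rater variability mentioned above is Cohen's kappa, a standard agreement statistic that corrects raw agreement for the agreement expected by chance. The sketch below applies it to two annotators' voxel labels. It is an illustrative consistency check, not something the paper reports, and the example labels are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of voxels")
    n = len(rater_a)
    # Fraction of voxels on which the two raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected if each rater labeled independently at their own rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label everywhere; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical labels from two annotators for the same eight voxels
    # (0 = background, 1 = organ).
    rater_a = [0, 0, 1, 1, 1, 0, 1, 0]
    rater_b = [0, 0, 1, 1, 0, 0, 1, 1]
    print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. For segmentation masks, overlap measures such as Dice computed between annotators are a common alternative.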

Despite these potential limitations, the AbdomenAtlas dataset represents a valuable resource for the medical imaging community, and its open and standardized nature will likely drive significant progress in the development of more accurate and clinically-relevant AI models.

Conclusion

The AbdomenAtlas dataset is a substantial new resource for the field of medical image analysis. By providing a large, detailed, and multi-center dataset of abdominal CT scans with comprehensive annotations, AbdomenAtlas enables efficient transfer learning and open algorithmic benchmarking.

The potential impact of this dataset is far-reaching, as it could lead to the development of more accurate and robust AI models for a wide range of clinical applications, from organ segmentation to disease detection and treatment planning. As the field of medical imaging continues to evolve, resources like AbdomenAtlas will be crucial in driving the next generation of AI-powered healthcare solutions.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking

Wenxuan Li, Chongyu Qu, Xiaoxi Chen, Pedro R. A. S. Bassi, Yijia Shi, Yuxiang Lai, Qian Yu, Huimin Xue, Yixiong Chen, Xiaorui Lin, Yutong Tang, Yining Cao, Haoqi Han, Zheyuan Zhang, Jiawei Liu, Tiezheng Zhang, Yujiu Ma, Jincheng Wang, Guang Zhang, Alan Yuille, Zongwei Zhou

We introduce the largest abdominal CT dataset (termed AbdomenAtlas) of 20,460 three-dimensional CT volumes sourced from 112 hospitals across diverse populations, geographies, and facilities. AbdomenAtlas provides 673K high-quality masks of anatomical structures in the abdominal region annotated by a team of 10 radiologists with the help of AI algorithms. We start by having expert radiologists manually annotate 22 anatomical structures in 5,246 CT volumes. Following this, a semi-automatic annotation procedure is performed for the remaining CT volumes, where radiologists revise the annotations predicted by AI, and in turn, AI improves its predictions by learning from revised annotations. Such a large-scale, detailed-annotated, and multi-center dataset is needed for two reasons. Firstly, AbdomenAtlas provides important resources for AI development at scale, branded as large pre-trained models, which can alleviate the annotation workload of expert radiologists to transfer to broader clinical applications. Secondly, AbdomenAtlas establishes a large-scale benchmark for evaluating AI algorithms -- the more data we use to test the algorithms, the better we can guarantee reliable performance in complex clinical scenarios. An ISBI & MICCAI challenge named BodyMaps: Towards 3D Atlas of Human Body was launched using a subset of our AbdomenAtlas, aiming to stimulate AI innovation and to benchmark segmentation accuracy, inference efficiency, and domain generalizability. We hope our AbdomenAtlas can set the stage for larger-scale clinical trials and offer exceptional opportunities to practitioners in the medical imaging community. Codes, models, and datasets are available at https://www.zongweiz.com/dataset

7/24/2024

The RSNA Abdominal Traumatic Injury CT (RATIC) Dataset

Jeffrey D. Rudie, Hui-Ming Lin, Robyn L. Ball, Sabeena Jalal, Luciano M. Prevedello, Savvas Nicolaou, Brett S. Marinelli, Adam E. Flanders, Kirti Magudia, George Shih, Melissa A. Davis, John Mongan, Peter D. Chang, Ferco H. Berger, Sebastiaan Hermans, Meng Law, Tyler Richards, Jan-Peter Grunz, Andreas Steven Kunz, Shobhit Mathur, Sandro Galea-Soler, Andrew D. Chung, Saif Afat, Chin-Chi Kuo, Layal Aweidah, Ana Villanueva Campos, Arjuna Somasundaram, Felipe Antonio Sanchez Tijmes, Attaporn Jantarangkoon, Leonardo Kayat Bittencourt, Michael Brassil, Ayoub El Hajjami, Hakan Dogan, Muris Becircic, Agrahara G. Bharatkumar, Eduardo Moreno Júdice de Mattos Farina, Dataset Curator Group, Dataset Contributor Group, Dataset Annotator Group, Errol Colak

The RSNA Abdominal Traumatic Injury CT (RATIC) dataset is the largest publicly available collection of adult abdominal CT studies annotated for traumatic injuries. This dataset includes 4,274 studies from 23 institutions across 14 countries. The dataset is freely available for non-commercial use via Kaggle at https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection. Created for the RSNA 2023 Abdominal Trauma Detection competition, the dataset encourages the development of advanced machine learning models for detecting abdominal injuries on CT scans. The dataset encompasses detection and classification of traumatic injuries across multiple organs, including the liver, spleen, kidneys, bowel, and mesentery. Annotations were created by expert radiologists from the American Society of Emergency Radiology (ASER) and Society of Abdominal Radiology (SAR). The dataset is annotated at multiple levels, including the presence of injuries in three solid organs with injury grading, image-level annotations for active extravasations and bowel injury, and voxelwise segmentations of each of the potentially injured organs. With the release of this dataset, we hope to facilitate research and development in machine learning and abdominal trauma that can lead to improved patient care and outcomes.

5/31/2024

Automatic Organ and Pan-cancer Segmentation in Abdomen CT: the FLARE 2023 Challenge

Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Ershuai Wang, Qin Zhou, Ziyan Huang, Pengju Lyu, Jian He, Bo Wang

Organ and cancer segmentation in abdomen Computed Tomography (CT) scans is the prerequisite for precise cancer diagnosis and treatment. Most existing benchmarks and algorithms are tailored to specific cancer types, limiting their ability to provide comprehensive cancer analysis. This work presents the first international competition on abdominal organ and pan-cancer segmentation by providing a large-scale and diverse dataset, including 4650 CT scans with various cancer types from over 40 medical centers. The winning team established a new state-of-the-art with a deep learning-based cascaded framework, achieving average Dice Similarity Coefficient scores of 92.3% for organs and 64.9% for lesions on the hidden multi-national testing set. The dataset and code of top teams are publicly available, offering a benchmark platform to drive further innovations https://codalab.lisn.upsaclay.fr/competitions/12239.

8/23/2024

Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases

Xiangde Luo, Zihan Li, Shaoting Zhang, Wenjun Liao, Guotai Wang

Deep learning has enabled great strides in abdominal multi-organ segmentation, even surpassing junior oncologists on common cases or organs. However, robustness on corner cases and complex organs remains a challenging open problem for clinical adoption. To investigate model robustness, we collected and annotated the RAOS dataset comprising 413 CT scans (~80k 2D images, ~8k 3D organ annotations) from 413 patients each with 17 (female) or 19 (male) labelled organs, manually delineated by oncologists. We grouped scans based on clinical information into 1) diagnosis/radiotherapy (317 volumes), 2) partial excision without the whole organ missing (22 volumes), and 3) excision with the whole organ missing (74 volumes). RAOS provides a potential benchmark for evaluating model robustness including organ hallucination. It also includes some organs that can be very hard to access on public datasets like the rectum, colon, intestine, prostate and seminal vesicles. We benchmarked several state-of-the-art methods in these three clinical groups to evaluate performance and robustness. We also assessed cross-generalization between RAOS and three public datasets. This dataset and comprehensive analysis establish a potential baseline for future robustness research: https://github.com/Luoxd1996/RAOS.

6/21/2024