FairSeg: A Large-Scale Medical Image Segmentation Dataset for Fairness Learning Using Segment Anything Model with Fair Error-Bound Scaling

Read original: arXiv:2311.02189 - Published 5/2/2024 by Yu Tian, Min Shi, Yan Luo, Ava Kouhana, Tobias Elze, Mengyu Wang


Overview

Fairness in AI models, especially in medical applications, is critical for people's well-being and lives.
Existing medical fairness datasets are limited to classification tasks, but medical segmentation is an equally important clinical task.
The authors propose the first fairness dataset for medical segmentation, called Harvard-FairSeg, and a fair error-bound scaling approach to improve segmentation performance equity.

Plain English Explanation

AI models are increasingly being used in medical applications, and it's crucial that these models treat all people fairly, regardless of their background or characteristics. Existing fairness datasets for medical AI focus on classification tasks, like diagnosing diseases, but there's also an important need for fairness in medical image segmentation, which is the process of identifying and outlining specific structures or abnormalities in medical images.

The researchers created the first fairness dataset for medical image segmentation, called Harvard-FairSeg, which contains 10,000 subject samples along with identity-group information about the patients. This dataset can be used to train and evaluate AI models to check that they segment medical images accurately and fairly for all patients.

The researchers also developed a new technique called "fair error-bound scaling" that aims to improve the fairness of AI models for medical image segmentation. This approach adjusts the training process to focus more on the "hard cases" where the model is struggling the most for certain groups of patients, in order to ensure the model performs equally well for everyone.

By creating this new fairness dataset and technique, the researchers hope to make significant progress in ensuring AI models used in medical settings are fair and equitable for all patients, which is crucial for people's health and well-being.

Technical Explanation

The authors propose the first fairness dataset for medical image segmentation, called Harvard-FairSeg, which contains 10,000 subject samples. This is an important contribution, as existing medical fairness datasets are limited to classification tasks, while medical segmentation is an equally critical clinical task that can provide detailed spatial information on organ abnormalities.

To address fairness in medical segmentation, the authors also propose a "fair error-bound scaling" approach. This technique uses the Segment Anything Model (SAM) and reweights the loss function based on the upper error-bound in each identity group. The goal is to explicitly focus on the "hard cases" with high training errors in each group, in order to improve the segmentation performance equity.
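The reweighting idea can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a per-sample soft Dice loss, takes the largest loss observed in each identity group within a batch as that group's empirical upper error bound, and scales every sample's loss by its group's bound. The function names (`soft_dice_loss`, `group_error_bound_weights`, `reweighted_loss`) and the normalization to a mean weight of 1 are hypothetical choices made for illustration, and the SAM backbone is omitted.

```python
import numpy as np

def soft_dice_loss(probs, targets, eps=1e-7):
    """Per-sample soft Dice loss in [0, 1]; inputs have shape (batch, H, W)."""
    probs = np.asarray(probs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    axes = tuple(range(1, probs.ndim))  # reduce over all non-batch axes
    intersection = (probs * targets).sum(axis=axes)
    denom = probs.sum(axis=axes) + targets.sum(axis=axes)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)

def group_error_bound_weights(per_sample_loss, groups):
    """Assumed scheme: weight each sample by the max loss seen in its group."""
    losses = np.asarray(per_sample_loss, dtype=float)
    groups = np.asarray(groups)
    weights = np.empty_like(losses)
    for g in np.unique(groups):
        mask = groups == g
        # Empirical upper error bound for group g within this batch.
        weights[mask] = losses[mask].max()
    # Normalize so the average weight is 1 (guarding against an all-zero batch).
    return weights / max(weights.mean(), 1e-12)

def reweighted_loss(probs, targets, groups):
    """Batch loss in which groups with harder cases contribute more."""
    losses = soft_dice_loss(probs, targets)
    weights = group_error_bound_weights(losses, groups)
    return float((weights * losses).mean())
```

Under these assumptions, a group whose worst case in the batch has a high loss receives a larger weight, so its samples pull harder on the training objective. In actual training the weights would typically be treated as constants with respect to the gradient, in a framework that supports automatic differentiation.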

To evaluate fairness, the authors introduce a novel equity-scaled segmentation performance metric, which places standard segmentation metrics in the context of fairness (for example, the equity-scaled Dice coefficient). Through comprehensive experiments, they report that their fair error-bound scaling approach achieves fairness performance that is either superior or comparable to state-of-the-art fairness learning models.
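As a rough illustration of what an equity-scaled metric could look like, the sketch below divides the overall score by one plus the summed absolute gaps between the overall score and each group's score. This form is an assumption made for illustration and should be checked against the paper's definition; the helper names `dice_coefficient` and `equity_scaled_score`, and the choice to average per-sample scores, are likewise hypothetical.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def equity_scaled_score(per_sample_scores, groups):
    """Overall mean score penalized by group-level disparities (assumed form)."""
    scores = np.asarray(per_sample_scores, dtype=float)
    groups = np.asarray(groups)
    overall = scores.mean()
    # Sum of absolute gaps between the overall score and each group's mean score.
    disparity = sum(
        abs(overall - scores[groups == g].mean()) for g in np.unique(groups)
    )
    return overall / (1.0 + disparity)
```

With this form, the equity-scaled score equals the plain score when every group performs identically and shrinks as the gaps between groups grow, so a model cannot score well by performing well on only some groups.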

Critical Analysis

The authors acknowledge that their fairness dataset and technique are a first step towards addressing fairness in medical image segmentation, and there is still room for improvement. For example, the dataset could be expanded to include more diversity in terms of demographics and medical conditions.

Additionally, the fair error-bound scaling approach focuses on improving performance equity, but it doesn't directly address other fairness aspects, such as representational fairness or causal fairness. Further research is needed to develop comprehensive fairness-aware techniques for medical image segmentation.

It's also important to note that fairness in AI is a complex and multifaceted challenge, and there may be inherent trade-offs between different fairness objectives. The authors' work highlights the importance of continued research and collaboration to address these issues and ensure AI systems used in healthcare are truly fair and equitable for all patients.

Conclusion

The authors have made an important contribution by creating the first fairness dataset for medical image segmentation and proposing a novel fair error-bound scaling approach to improve segmentation performance equity. This work represents a significant step forward in promoting fairness in AI models used in critical medical applications, which is crucial for people's health and well-being. As the use of AI in healthcare continues to grow, it will be essential to build on this research and develop even more comprehensive fairness-aware techniques to ensure these powerful tools benefit all patients equally.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


FairSeg: A Large-Scale Medical Image Segmentation Dataset for Fairness Learning Using Segment Anything Model with Fair Error-Bound Scaling

Yu Tian, Min Shi, Yan Luo, Ava Kouhana, Tobias Elze, Mengyu Wang

Fairness in artificial intelligence models has gained significantly more attention in recent years, especially in the area of medicine, as fairness in medical models is critical to people's well-being and lives. High-quality medical fairness datasets are needed to promote fairness learning research. Existing medical fairness datasets are all for classification tasks, and no fairness datasets are available for medical segmentation, while medical segmentation is an equally important clinical task as classifications, which can provide detailed spatial information on organ abnormalities ready to be assessed by clinicians. In this paper, we propose the first fairness dataset for medical segmentation named Harvard-FairSeg with 10,000 subject samples. In addition, we propose a fair error-bound scaling approach to reweight the loss function with the upper error-bound in each identity group, using the segment anything model (SAM). We anticipate that the segmentation performance equity can be improved by explicitly tackling the hard cases with high training errors in each identity group. To facilitate fair comparisons, we utilize a novel equity-scaled segmentation performance metric to compare segmentation metrics in the context of fairness, such as the equity-scaled Dice coefficient. Through comprehensive experiments, we demonstrate that our fair error-bound scaling approach either has superior or comparable fairness performance to the state-of-the-art fairness learning models. The dataset and code are publicly accessible via https://ophai.hms.harvard.edu/datasets/harvard-fairseg10k.

5/2/2024

An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Yajing Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to the fairness considerations. This oversight raises questions about the potential for performance biases that could mirror those found in task-specific deep learning models like nnU-Net. In this paper, we explored the fairness dilemma concerning large segmentation foundation models. We prospectively curate a benchmark dataset of 3D MRI and CT scans of the organs including liver, kidney, spleen, lung and aorta from a total of 1056 healthy subjects with expert segmentations. Crucially, we document demographic details such as gender, age, and body mass index (BMI) for each subject to facilitate a nuanced fairness analysis. We test state-of-the-art foundation models for medical image segmentation, including the original SAM, medical SAM and SAT models, to evaluate segmentation efficacy across different demographic groups and identify disparities. Our comprehensive analysis, which accounts for various confounding factors, reveals significant fairness concerns within these foundational models. Moreover, our findings highlight not only disparities in overall segmentation metrics, such as the Dice Similarity Coefficient but also significant variations in the spatial distribution of segmentation errors, offering empirical evidence of the nuanced challenges in ensuring fairness in medical image segmentation.

6/19/2024

FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification

Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang

Addressing fairness in artificial intelligence (AI), particularly in medical AI, is crucial for ensuring equitable healthcare outcomes. Recent efforts to enhance fairness have introduced new methodologies and datasets in medical AI. However, the fairness issue under the setting of domain transfer is almost unexplored, while it is common that clinics rely on different imaging technologies (e.g., different retinal imaging modalities) for patient diagnosis. This paper presents FairDomain, a pioneering systemic study into algorithmic fairness under domain shifts, employing state-of-the-art domain adaptation (DA) and generalization (DG) algorithms for both medical segmentation and classification tasks to understand how biases are transferred between different domains. We also introduce a novel plug-and-play fair identity attention (FIA) module that adapts to various DA and DG algorithms to improve fairness by using self-attention to adjust feature importance based on demographic attributes. Additionally, we curate the first fairness-focused dataset with two paired imaging modalities for the same patient cohort on medical segmentation and classification tasks, to rigorously assess fairness in domain-shift scenarios. Excluding the confounding impact of demographic distribution variation between source and target domains will allow clearer quantification of the performance of domain transfer models. Our extensive evaluations reveal that the proposed FIA significantly enhances both model performance accounted for fairness across all domain shift settings (i.e., DA and DG) with respect to different demographics, which outperforms existing methods on both segmentation and classification. The code and data can be accessed at https://ophai.hms.harvard.edu/datasets/harvard-fairdomain20k.

7/22/2024

FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling

Yan Luo, Muhammad Osama Khan, Yu Tian, Min Shi, Zehao Dou, Tobias Elze, Yi Fang, Mengyu Wang

Equity in AI for healthcare is crucial due to its direct impact on human well-being. Despite advancements in 2D medical imaging fairness, the fairness of 3D models remains underexplored, hindered by the small sizes of 3D fairness datasets. Since 3D imaging surpasses 2D imaging in SOTA clinical care, it is critical to understand the fairness of these 3D models. To address this research gap, we conduct the first comprehensive study on the fairness of 3D medical imaging models across multiple protected attributes. Our investigation spans both 2D and 3D models and evaluates fairness across five architectures on three common eye diseases, revealing significant biases across race, gender, and ethnicity. To alleviate these biases, we propose a novel fair identity scaling (FIS) method that improves both overall performance and fairness, outperforming various SOTA fairness methods. Moreover, we release Harvard-FairVision, the first large-scale medical fairness dataset with 30,000 subjects featuring both 2D and 3D imaging data and six demographic identity attributes. Harvard-FairVision provides labels for three major eye disorders affecting about 380 million people worldwide, serving as a valuable resource for both 2D and 3D fairness learning. Our code and dataset are publicly accessible at https://ophai.hms.harvard.edu/datasets/harvard-fairvision30k.

4/15/2024