Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective

Read original: arXiv:2407.07841 - Published 7/11/2024 by Shengjia Chen, Gabriele Campanella, Abdulkadir Elmas, Aryeh Stock, Jennifer Zeng, Alexandros D. Polydorides, Adam J. Schoenfeld, Kuan-lin Huang, Jane Houldsworth, Chad Vanderbilt, and Thomas J. Fuchs

Overview

This paper benchmarks methods for aggregating tile-level image embeddings into slide-level representations in computational pathology, using diagnostic slides from clinical practice rather than datasets built for genomic research.
The authors compare ten slide-level aggregation techniques across nine clinically relevant tasks, including diagnostic assessment, biomarker classification, and outcome prediction.
The goal is to identify which approaches make the best use of image embeddings in computational pathology, a field that increasingly supports clinical decision-making.

Plain English Explanation

The paper looks at different ways of combining, or "aggregating," the numerical representations (called "embeddings") of medical images in the field of computational pathology. Computational pathology is the use of computer algorithms to analyze medical images, like those taken during a biopsy, to help doctors make better decisions about a patient's health.

The authors test ten aggregation methods and measure how well each performs on nine tasks, such as making a diagnosis, detecting a biomarker, or predicting a patient's outcome. They find that no single method is best on every task, and that the choice of model used to produce the embeddings matters: models trained on pathology images do better than generic ones trained on everyday photographs.

Technical Explanation

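A whole slide image (WSI) is split into tens of thousands of tiles, each encoded as an embedding. An aggregator must then turn those tile-level embeddings into a single slide-level representation. The paper benchmarks ten such aggregators across nine clinically relevant tasks spanning diagnostic assessment, biomarker classification, and outcome prediction.

The aggregators range from simple pooling (such as max pooling) and attention-based pooling to spatially-aware models. Tile embeddings come either from pathology-specific foundation models (FMs) trained with self-supervised learning or from generic ImageNet pre-trained models. Unlike much prior work, which develops methods on genomic research datasets such as TCGA, the evaluation uses diagnostic slides from clinical practice.

Three findings stand out. First, embeddings from pathology-specific FMs outperform those from ImageNet-based models across aggregation methods. Second, spatially-aware aggregators improve performance significantly with ImageNet pre-trained embeddings but not with FM embeddings. Third, no single aggregator excels on all tasks.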
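To make the aggregation step concrete, here is a minimal Python sketch of two aggregator types mentioned in this summary, max pooling and attention-based pooling. It is not the paper's implementation: the attention form (a softmax over scores w · tanh(V h_i)), the function names, and the toy random data are assumptions chosen for illustration, and V and w stand in for parameters that would normally be learned.

```python
import math
import random


def max_pool(tiles):
    """Slide vector = per-dimension maximum over all tile embeddings."""
    return [max(column) for column in zip(*tiles)]


def attention_pool(tiles, V, w):
    """Slide vector = attention-weighted average of tile embeddings.

    Each tile h_i gets a score w . tanh(V h_i); a softmax over the tiles
    turns the scores into weights. V (hidden x dim) and w (hidden) are
    illustrative stand-ins for learned parameters.
    """
    scores = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(w_k * z_k for w_k, z_k in zip(w, hidden)))

    # Numerically stable softmax over the tile scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(tiles[0])
    slide = [sum(a * h[d] for a, h in zip(weights, tiles)) for d in range(dim)]
    return slide, weights


# Toy data (assumption): 5 tiles with 4-dimensional embeddings. Real slides
# have tens of thousands of tiles and much wider embeddings.
rng = random.Random(0)
tiles = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
V = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
w = [rng.gauss(0, 1) for _ in range(3)]

slide_max = max_pool(tiles)
slide_attn, attn_weights = attention_pool(tiles, V, w)
```

Both functions map a variable number of tiles to one fixed-length vector, which is what lets a downstream classifier operate at the slide level. Setting w to all zeros gives every tile the same weight, so attention pooling reduces to a plain average, which is a convenient sanity check.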

Critical Analysis

The paper provides a thorough evaluation of embedding aggregation methods on clinical data. However, the results come from a fixed set of nine tasks, and because no single aggregator excels on all of them, the rankings should not be read as a universal recommendation. Further testing on more diverse clinical data would help confirm how well the conclusions generalize.

Additionally, the paper does not explore the potential biases or ethical implications of using these techniques in a clinical setting. As with any AI-powered tool for healthcare, there are important considerations around fairness, transparency, and the potential for unintended consequences that warrant further investigation.

Conclusion

This study offers valuable insights into the most effective ways of leveraging image embeddings in computational pathology. The benchmarking results can help guide the development of more accurate and reliable computer-aided decision support tools for clinicians.

However, the findings should be considered in the context of the noted limitations, and future research should address the broader ethical and societal impact of deploying such technologies in healthcare. Continued collaboration between machine learning researchers and domain experts will be crucial for responsibly advancing the field of computational pathology.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective

Shengjia Chen, Gabriele Campanella, Abdulkadir Elmas, Aryeh Stock, Jennifer Zeng, Alexandros D. Polydorides, Adam J. Schoenfeld, Kuan-lin Huang, Jane Houldsworth, Chad Vanderbilt, Thomas J. Fuchs

Recent advances in artificial intelligence (AI), in particular self-supervised learning of foundation models (FMs), are revolutionizing medical imaging and computational pathology (CPath). A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation. Due to the prevalent use of datasets created for genomic research, such as TCGA, for method development, the performance of these techniques on diagnostic slides from clinical practice has been inadequately explored. This study conducts a thorough benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks, including diagnostic assessment, biomarker classification, and outcome prediction. The results yield following key insights: (1) Embeddings derived from domain-specific (histological images) FMs outperform those from generic ImageNet-based models across aggregation methods. (2) Spatial-aware aggregators enhance the performance significantly when using ImageNet pre-trained models but not when using FMs. (3) No single model excels in all tasks and spatially-aware models do not show general superiority as it would be expected. These findings underscore the need for more adaptable and universally applicable aggregation techniques, guiding future research towards tools that better meet the evolving needs of clinical-AI in pathology. The code used in this work is available at https://github.com/fuchs-lab-public/CPath_SABenchmark.

7/11/2024

Beyond Multiple Instance Learning: Full Resolution All-In-Memory End-To-End Pathology Slide Modeling

Gabriele Campanella, Eugene Fluder, Jennifer Zeng, Chad Vanderbilt, Thomas J. Fuchs

Artificial Intelligence (AI) has great potential to improve health outcomes by training systems on vast digitized clinical datasets. Computational Pathology, with its massive amounts of microscopy image data and impact on diagnostics and biomarkers, is at the forefront of this development. Gigapixel pathology slides pose a unique challenge due to their enormous size and are usually divided into tens of thousands of smaller tiles for analysis. This results in a discontinuity in the machine learning process by separating the training of tile-level encoders from slide-level aggregators and the need to adopt weakly supervised learning strategies. Training models from entire pathology slides end-to-end has been largely unexplored due to its computational challenges. To overcome this problem, we propose a novel approach to jointly train both a tile encoder and a slide-aggregator fully in memory and end-to-end at high-resolution, bridging the gap between input and slide-level supervision. While more computationally expensive, detailed quantitative validation shows promise for large-scale pre-training and fine-tuning of pathology foundation models.

5/24/2024

Zero-Shot Whole Slide Image Retrieval in Histopathology Using Embeddings of Foundation Models

Saghir Alfasly, Ghazal Alabtah, Sobhan Hemati, Krishna Rani Kalari, H. R. Tizhoosh

We have tested recently published foundation models for histopathology for image retrieval. We report macro average of F1 score for top-1 retrieval, majority of top-3 retrievals, and majority of top-5 retrievals. We perform zero-shot retrievals, i.e., we do not alter embeddings and we do not train any classifier. As test data, we used diagnostic slides of TCGA, The Cancer Genome Atlas, consisting of 23 organs and 117 cancer subtypes. As a search platform we used Yottixel that enabled us to perform WSI search using patches. Achieved F1 scores show low performance, e.g., for top-5 retrievals, 27% +/- 13% (Yottixel-DenseNet), 42% +/- 14% (Yottixel-UNI), 40% +/- 13% (Yottixel-Virchow), 41% +/- 13% (Yottixel-GigaPath), and 41% +/- 14% (GigaPath WSI).

9/14/2024

A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

Yingxue Xu, Yihui Wang, Fengtao Zhou, Jiabo Ma, Shu Yang, Huangjing Lin, Xin Wang, Jiguang Wang, Li Liang, Anjia Han, Ronald Cheong Kin Chan, Hao Chen

Remarkable strides in computational pathology have been made in the task-agnostic foundation model that advances the performance of a wide array of downstream clinical tasks. Despite the promising performance, there are still several challenges. First, prior works have resorted to either vision-only or vision-captions data, disregarding invaluable pathology reports and gene expression profiles which respectively offer distinct knowledge for versatile clinical applications. Second, the current progress in pathology FMs predominantly concentrates on the patch level, where the restricted context of patch-level pretraining fails to capture whole-slide patterns. Here we curated the largest multimodal dataset consisting of H&E diagnostic whole slide images and their associated pathology reports and RNA-Seq data, resulting in 26,169 slide-level modality pairs from 10,275 patients across 32 cancer types. To leverage these data for CPath, we propose a novel whole-slide pretraining paradigm which injects multimodal knowledge at the whole-slide context into the pathology FM, called Multimodal Self-TAught PRetraining (mSTAR). The proposed paradigm revolutionizes the workflow of pretraining for CPath, which enables the pathology FM to acquire the whole-slide context. To our knowledge, this is the first attempt to incorporate multimodal knowledge at the slide level for enhancing pathology FMs, expanding the modelling context from unimodal to multimodal knowledge and from patch-level to slide-level. To systematically evaluate the capabilities of mSTAR, extensive experiments including slide-level unimodal and multimodal applications, are conducted across 7 diverse types of tasks on 43 subtasks, resulting in the largest spectrum of downstream tasks. The average performance in various slide-level applications consistently demonstrates significant performance enhancements for mSTAR compared to SOTA FMs.

7/23/2024