PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains

Read original: arXiv:2312.09894 - Published 8/7/2024 by Shengyi Hua, Fang Yan, Tianle Shen, Lei Ma, Xiaofan Zhang

Overview

PathoDuet is a series of pretrained foundation models, together with a self-supervised learning framework, for analyzing pathological slide images stained with H&E (hematoxylin and eosin) and IHC (immunohistochemistry).
The framework introduces a pretext token and task raisers so that pretraining can explicitly exploit relations between images, such as multiple magnifications and multiple stains.
The researchers evaluate the models on downstream tasks in both fields: patch-level colorectal cancer subtyping and whole slide image (WSI)-level classification for H&E, and marker expression level prediction, tumor identification, and slide-level qualitative analysis for IHC.

Plain English Explanation

The paper presents a new technique called PathoDuet that uses advanced AI models, called "foundation models," to analyze medical images of tissue samples from patients. These tissue samples are stained with special dyes called H&E and IHC, which help pathologists (doctors who study diseases) identify different structures and components in the tissue.

The researchers developed a way to train the foundation models to learn useful visual patterns and features from these stained tissue images, without needing detailed labels or annotations. This "self-supervised" approach allows the models to discover relevant information on their own, similar to how humans can recognize patterns and structures in images without being explicitly told what they are.

With this self-supervised pretraining, the PathoDuet models perform better on a range of pathology tasks, such as subtyping patches of colorectal cancer tissue, classifying whole slide images, predicting the expression level of IHC markers, and identifying tumors. This could help pathologists work more efficiently and make more accurate diagnoses, ultimately improving patient care.

Technical Explanation

The PathoDuet paper proposes a self-supervised pretraining framework for learning visual representations from pathological slide data stained with H&E and IHC. The key innovations include:

Pretext Token and Task Raisers: The framework adds a pretext token and task raisers to the model so that pretraining can explicitly use known relations between images, such as multiple magnifications or multiple stains.
Contrastive Learning: They employ a contrastive learning objective that pushes the model to learn discriminative representations by pulling positive image pairs together and pushing negative pairs apart during pretraining.
Cross-Scale Positioning: This pretext task uses the relation between images at different magnifications and is used to pretrain the model on H&E images.
Cross-Stain Transferring: This pretext task uses the relation between differently stained images and is used to transfer the H&E model to IHC images, rather than jointly training a single model on both stains.
Efficient Fine-tuning: The pretrained PathoDuet models can be fine-tuned on downstream tasks, such as patch-level subtyping, WSI-level classification, and IHC marker expression level prediction, by reusing the visual features learned during pretraining.
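
To make the contrastive learning idea above more concrete, here is a minimal NumPy sketch of a generic InfoNCE-style loss. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and the random stand-in embeddings are all hypothetical, and the actual PathoDuet objective may differ in its details.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def info_nce_loss(queries, keys, temperature=0.2):
    """Generic InfoNCE loss (illustrative, not the paper's exact objective).

    queries[i] and keys[i] are embeddings of two views of the same patch
    (a positive pair); every other row in the batch serves as a negative.
    """
    q = l2_normalize(queries)
    k = l2_normalize(keys)
    logits = q @ k.T / temperature                    # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))

# Random vectors stand in for encoder outputs (an assumption for this demo).
rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 32))
aligned = anchor + 0.05 * rng.normal(size=(8, 32))   # views that agree
unrelated = rng.normal(size=(8, 32))                 # views that do not

loss_aligned = info_nce_loss(anchor, aligned)
loss_unrelated = info_nce_loss(anchor, unrelated)
print(f"aligned pairs:   {loss_aligned:.3f}")
print(f"unrelated pairs: {loss_unrelated:.3f}")
```

The loss is lower when the two views of each patch map to nearby embeddings, which is the behavior a contrastive objective rewards during pretraining.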

The researchers evaluate the PathoDuet models on multiple public datasets and report better performance than prior methods on most tasks. They also provide ablation studies to validate the importance of the key design choices, including the proposed pretext tasks.
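
One common way to check how useful pretrained features are for a downstream classification task is a linear probe: freeze the encoder and train only a linear classifier on its outputs. The sketch below assumes this protocol purely for illustration; this summary does not state which evaluation protocol the paper uses, and the synthetic features, function names, and hyperparameters are all hypothetical.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.01, epochs=300):
    """Fit a softmax classifier on frozen features with plain gradient descent."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # d(cross-entropy)/d(logits)
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(features, W, b):
    return np.argmax(features @ W + b, axis=1)

# Synthetic "frozen encoder outputs": one Gaussian cluster per class.
# In practice these would be embeddings of tissue patches (an assumption here).
rng = np.random.default_rng(0)
num_classes, dim = 3, 16
centers = rng.normal(scale=3.0, size=(num_classes, dim))
labels = rng.integers(0, num_classes, size=300)
features = centers[labels] + rng.normal(size=(300, dim))

# Train on the first 200 samples, evaluate on the remaining 100.
W, b = train_linear_probe(features[:200], labels[:200], num_classes)
accuracy = float(np.mean(predict(features[200:], W, b) == labels[200:]))
print(f"held-out accuracy: {accuracy:.2f}")
```

Because only the linear layer is trained, the resulting accuracy mostly reflects the quality of the frozen features rather than the capacity of the classifier.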

Critical Analysis

The PathoDuet paper presents a well-designed and thoroughly evaluated approach for leveraging foundation models in the domain of pathological slide analysis. The researchers have carefully considered the unique challenges of this domain and proposed innovative solutions to address them.

One potential limitation is the reliance on public datasets, which may not fully capture the diversity and complexity of real-world pathological slides. The researchers acknowledge this and suggest the need for further evaluation on more diverse, clinically-relevant datasets.

Additionally, while the self-supervised pretraining approach is effective, the paper does not explore the interpretability of the learned representations. Understanding the underlying visual patterns and features discovered by the model could provide valuable insights for pathologists and further improve the trust and adoption of such AI-powered tools in clinical practice.

Conclusion

The PathoDuet paper introduces a promising approach for applying foundation models to the analysis of pathological slide images. By pretraining on H&E images with self-supervision and then transferring to IHC images, the researchers have developed a series of models that can be adapted to a variety of important tasks in digital pathology.

The reported improvements on most of the evaluated H&E and IHC tasks suggest that the PathoDuet framework has the potential to enhance the efficiency and accuracy of pathological slide analysis, ultimately contributing to improved patient outcomes. Further research on the interpretability of the learned representations and on the robustness of the approach on diverse, real-world datasets could strengthen the impact of this work.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains

Shengyi Hua, Fang Yan, Tianle Shen, Lei Ma, Xiaofan Zhang

Large amounts of digitized histopathological data display a promising future for developing pathological foundation models via self-supervised learning methods. Foundation models pretrained with these methods serve as a good basis for downstream tasks. However, the gap between natural and histopathological images hinders the direct application of existing methods. In this work, we present PathoDuet, a series of pretrained models on histopathological images, and a new self-supervised learning framework in histopathology. The framework is featured by a newly-introduced pretext token and later task raisers to explicitly utilize certain relations between images, like multiple magnifications and multiple stains. Based on this, two pretext tasks, cross-scale positioning and cross-stain transferring, are designed to pretrain the model on Hematoxylin and Eosin (H&E) images and transfer the model to immunohistochemistry (IHC) images, respectively. To validate the efficacy of our models, we evaluate the performance over a wide variety of downstream tasks, including patch-level colorectal cancer subtyping and whole slide image (WSI)-level classification in H&E field, together with expression level prediction of IHC marker, tumor identification and slide-level qualitative analysis in IHC field. The experimental results show the superiority of our models over most tasks and the efficacy of proposed pretext tasks. The codes and models are available at https://github.com/openmedlab/PathoDuet.

8/7/2024

A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

Yingxue Xu, Yihui Wang, Fengtao Zhou, Jiabo Ma, Shu Yang, Huangjing Lin, Xin Wang, Jiguang Wang, Li Liang, Anjia Han, Ronald Cheong Kin Chan, Hao Chen

Remarkable strides in computational pathology have been made in the task-agnostic foundation model that advances the performance of a wide array of downstream clinical tasks. Despite the promising performance, there are still several challenges. First, prior works have resorted to either vision-only or vision-captions data, disregarding invaluable pathology reports and gene expression profiles which respectively offer distinct knowledge for versatile clinical applications. Second, the current progress in pathology FMs predominantly concentrates on the patch level, where the restricted context of patch-level pretraining fails to capture whole-slide patterns. Here we curated the largest multimodal dataset consisting of H&E diagnostic whole slide images and their associated pathology reports and RNA-Seq data, resulting in 26,169 slide-level modality pairs from 10,275 patients across 32 cancer types. To leverage these data for CPath, we propose a novel whole-slide pretraining paradigm which injects multimodal knowledge at the whole-slide context into the pathology FM, called Multimodal Self-TAught PRetraining (mSTAR). The proposed paradigm revolutionizes the workflow of pretraining for CPath, which enables the pathology FM to acquire the whole-slide context. To our knowledge, this is the first attempt to incorporate multimodal knowledge at the slide level for enhancing pathology FMs, expanding the modelling context from unimodal to multimodal knowledge and from patch-level to slide-level. To systematically evaluate the capabilities of mSTAR, extensive experiments including slide-level unimodal and multimodal applications, are conducted across 7 diverse types of tasks on 43 subtasks, resulting in the largest spectrum of downstream tasks. The average performance in various slide-level applications consistently demonstrates significant performance enhancements for mSTAR compared to SOTA FMs.

7/23/2024

Multistain Pretraining for Slide Representation Learning in Pathology

Guillaume Jaume, Anurag Vaidya, Andrew Zhang, Andrew H. Song, Richard J. Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long Phi Le, Faisal Mahmood

Developing self-supervised learning (SSL) models that can learn universal and transferable representations of H&E gigapixel whole-slide images (WSIs) is becoming increasingly valuable in computational pathology. These models hold the potential to advance critical tasks such as few-shot classification, slide retrieval, and patient stratification. Existing approaches for slide representation learning extend the principles of SSL from small images (e.g., 224 x 224 patches) to entire slides, usually by aligning two different augmentations (or views) of the slide. Yet the resulting representation remains constrained by the limited clinical and biological diversity of the views. Instead, we postulate that slides stained with multiple markers, such as immunohistochemistry, can be used as different views to form a rich task-agnostic training signal. To this end, we introduce Madeleine, a multimodal pretraining strategy for slide representation learning. Madeleine is trained with a dual global-local cross-stain alignment objective on large cohorts of breast cancer samples (N=4,211 WSIs across five stains) and kidney transplant samples (N=12,070 WSIs across four stains). We demonstrate the quality of slide representations learned by Madeleine on various downstream evaluations, ranging from morphological and molecular classification to prognostic prediction, comprising 21 tasks using 7,299 WSIs from multiple medical centers. Code is available at https://github.com/mahmoodlab/MADELEINE.

8/7/2024

RudolfV: A Foundation Model by Pathologists for Pathologists

Jonas Dippel, Barbara Feulner, Tobias Winterhoff, Timo Milbich, Stephan Tietz, Simon Schallenberg, Gabriel Dernbach, Andreas Kunft, Simon Heinke, Marie-Lisa Eich, Julika Ribbat-Idel, Rosemarie Krupar, Philipp Anders, Niklas Prenißl, Philipp Jurmeister, David Horst, Lukas Ruff, Klaus-Robert Müller, Frederick Klauschen, Maximilian Alber

Artificial intelligence has started to transform histopathology impacting clinical diagnostics and biomedical research. However, while many computational pathology approaches have been proposed, most current AI models are limited with respect to generalization, application variety, and handling rare diseases. Recent efforts introduced self-supervised foundation models to address these challenges, yet existing approaches do not leverage pathologist knowledge by design. In this study, we present a novel approach to designing foundation models for computational pathology, incorporating pathologist expertise, semi-automated data curation, and a diverse dataset from over 15 laboratories, including 58 tissue types, and encompassing 129 different histochemical and immunohistochemical staining modalities. We demonstrate that our model RudolfV surpasses existing state-of-the-art foundation models across different benchmarks focused on tumor microenvironment profiling, biomarker evaluation, and reference case search while exhibiting favorable robustness properties. Our study shows how domain-specific knowledge can increase the efficiency and performance of pathology foundation models and enable novel application areas.

6/12/2024