Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology

Read original: arXiv:2407.06116 - Published 7/9/2024 by Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Nancy R. Newlin, Adam M. Saunders, Can Cui and 9 others


Overview

This paper focuses on using inter-modality learning to classify different cell types in colon tissue samples using H&E (hematoxylin and eosin) staining.
The authors take advantage of multiplexed immunofluorescence (MxIF) histology to label 14 cell subclasses, then use style transfer to synthesize realistic virtual H&E images paired with the MxIF-derived cell subclassification labels.
They evaluate a supervised learning approach where the input is the virtual H&E images and the labels are the MxIF-derived cell subclasses, assessing the model's performance on both private virtual H&E and public real H&E data.

Plain English Explanation

Understanding how different types of cells in the body communicate and interact with each other is crucial for furthering our knowledge of how the human body functions. H&E staining is a widely used technique to visualize cells, but accurately identifying specific cell subtypes often requires expert knowledge and specialized stains.

To make this process easier, the authors of this paper propose using AI to classify different cell types in H&E images. They build on previous work, such as the CoNIC Challenge, which focused on labeling 6 cell types in colon tissue. However, the CoNIC Challenge did not separate epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), or connective subtypes (fibroblasts).

To address this, the researchers used a technique called inter-modality learning. They took advantage of a more advanced imaging method called multiplexed immunofluorescence (MxIF), which allowed them to label 14 different cell subclasses. They then applied a process called style transfer to the same MxIF tissues to create realistic virtual H&E images, so that each virtual image lines up with the MxIF-derived cell subclass labels.

By pairing the virtual H&E images with the MxIF-derived cell subclass labels, the researchers were able to train a supervised learning model to classify cell types in H&E images, including hard-to-identify subtypes like helper T cells and epithelial progenitors. They tested the model's performance on both the virtual H&E images and public real H&E data.

Technical Explanation

The researchers used a supervised learning approach to classify cell types in H&E images, leveraging the detailed cell subclass labeling provided by MxIF histology. They first performed style transfer on the MxIF tissues to synthesize virtual H&E images of the same tissue, creating a dataset of paired virtual H&E images and MxIF-derived cell subclass labels.
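The pairing step can be pictured with a small sketch. This is not the authors' pipeline: the patch size, the array layout, and the function name `extract_centered_patches` are all assumptions made for illustration. It only shows how a virtual H&E image and a list of nucleus centroids with subclass labels could be turned into (patch, label) training pairs.

```python
import numpy as np

def extract_centered_patches(image, centroids, labels, patch_size=32):
    """Crop a square patch around each nucleus centroid and pair it with its label.

    image:     (H, W, 3) array, standing in for a virtual H&E tile (assumed layout)
    centroids: list of (row, col) integer nucleus centers
    labels:    one subclass id per centroid (e.g. 0..13 for 14 subclasses)
    """
    half = patch_size // 2
    # Reflect-pad so nuclei near the border still yield full-size patches.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    patches = []
    for row, col in centroids:
        # After padding, original pixel (row, col) sits at (row + half, col + half),
        # so the window starting at (row, col) is centered on the nucleus.
        patches.append(padded[row:row + patch_size, col:col + patch_size])
    return np.stack(patches), np.asarray(labels)

# Toy data: a random image in place of a real virtual H&E tile.
rng = np.random.default_rng(0)
virtual_he = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
centroids = [(0, 0), (64, 64), (127, 100)]
labels = [3, 7, 12]  # hypothetical subclass ids

patches, targets = extract_centered_patches(virtual_he, centroids, labels)
print(patches.shape, targets)
```

Because the virtual H&E is synthesized from the same tissue that was imaged with MxIF, the centroid coordinates and labels apply to it directly, which is what makes this kind of pairing possible without manual annotation on H&E.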

They then trained a deep learning model to classify the cell types in the virtual H&E images, using the MxIF-derived labels as the target. On the virtual H&E data, the model was able to classify helper T cells and epithelial progenitors with positive predictive values of 0.34 ± 0.15 (prevalence 0.03 ± 0.01) and 0.47 ± 0.10 (prevalence 0.07 ± 0.02), respectively, when using ground truth centroid information.
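To read these numbers, it helps to recall that positive predictive value (PPV) is the fraction of predictions for a class that are correct, and that a classifier guessing at random would be expected to reach a PPV roughly equal to the class prevalence. The sketch below computes both quantities for one class and the ratio between them. The function names and the toy labels are illustrative assumptions; only the 0.34, 0.03, 0.47, and 0.07 figures come from the paper.

```python
import numpy as np

def ppv_and_prevalence(y_true, y_pred, cls):
    """Return (PPV, prevalence) for one class from per-nucleus labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    predicted = y_pred == cls
    actual = y_true == cls
    n_predicted = int(predicted.sum())
    true_positives = int((predicted & actual).sum())
    # PPV is undefined when the class is never predicted.
    ppv = true_positives / n_predicted if n_predicted > 0 else float("nan")
    prevalence = float(actual.mean())
    return ppv, prevalence

def lift_over_prevalence(ppv, prevalence):
    """How many times better than the random-guess baseline the PPV is."""
    return ppv / prevalence

# Toy labels (hypothetical), three classes.
y_true = [0, 0, 1, 1, 2, 2, 2, 0, 1, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2, 0, 2, 2]
ppv, prevalence = ppv_and_prevalence(y_true, y_pred, cls=1)
print(f"toy class 1: PPV={ppv:.2f}, prevalence={prevalence:.2f}")

# Mean values reported for virtual H&E.
helper_t_lift = lift_over_prevalence(0.34, 0.03)
progenitor_lift = lift_over_prevalence(0.47, 0.07)
print(f"helper T lift: {helper_t_lift:.1f}x, progenitor lift: {progenitor_lift:.1f}x")
```

On the reported means, helper T cells come out at roughly 11 times the prevalence baseline and epithelial progenitors at roughly 7 times, which is why a PPV of 0.34 on a class that makes up 3% of nuclei is more informative than it first appears.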

When evaluating the model on public real H&E data, the researchers found that it could classify helper T cells and epithelial progenitors with upper bound positive predictive values of 0.43 ± 0.03 (parent class prevalence 0.21) and 0.94 ± 0.02 (parent class prevalence 0.49), again using ground truth centroid information.
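The summary does not spell out how the "upper bound" is computed. One plausible reading, stated here as an assumption rather than the authors' method, is that the real H&E annotations carry only coarse parent classes, so a predicted subclass can only be checked against its parent: a helper T prediction counts as a hit if the nucleus is annotated as a lymphocyte. The result bounds the true subclass PPV from above, since some of those lymphocytes may be B or cytotoxic T cells. The mapping keys and the function name below are hypothetical; the subtype groupings follow the paper's abstract.

```python
# Hypothetical names for subclasses, grouped under parent classes as in the abstract.
PARENT_OF = {
    "helper_t": "lymphocyte",
    "cytotoxic_t": "lymphocyte",
    "b_cell": "lymphocyte",
    "progenitor": "epithelial",
    "goblet": "epithelial",
    "enteroendocrine": "epithelial",
    "fibroblast": "connective",
}

def upper_bound_ppv(pred_subclasses, true_parents, subclass):
    """Fraction of `subclass` predictions whose annotated parent class matches.

    This is an optimistic bound: a match at the parent level does not
    guarantee the fine-grained subclass is right.
    """
    parent = PARENT_OF[subclass]
    hits = [
        true_parent == parent
        for pred, true_parent in zip(pred_subclasses, true_parents)
        if pred == subclass
    ]
    if not hits:
        return float("nan")  # subclass never predicted
    return sum(hits) / len(hits)

# Toy predictions on nuclei that only have parent-level annotations.
pred = ["helper_t", "helper_t", "progenitor", "helper_t", "goblet"]
true_parent = ["lymphocyte", "epithelial", "epithelial", "lymphocyte", "epithelial"]

helper_t_bound = upper_bound_ppv(pred, true_parent, "helper_t")
progenitor_bound = upper_bound_ppv(pred, true_parent, "progenitor")
print(helper_t_bound, progenitor_bound)
```

Under this reading, the relevant baseline is the parent class prevalence (0.21 and 0.49 in the reported results), not the prevalence of the subclass itself.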

According to the authors, this is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E images, where these cells are difficult to identify without specialized stains. The use of inter-modality learning, leveraging the detailed MxIF labeling, allowed the researchers to expand the set of cell types that can be classified on H&E beyond those covered by previous approaches.

Critical Analysis

While this research represents an important step forward in cell type classification on H&E images, there are a few caveats and limitations to consider. First, the model's performance, while promising, is still far from perfect, especially for the rarer cell types. Further research and larger, more diverse datasets may be needed to improve the classification accuracy.

Additionally, the reliance on ground truth centroid information, which may not be readily available in many real-world scenarios, could limit the practical applicability of this approach. Future work should explore methods that can perform robust cell type classification without requiring such detailed positional information.

It's also worth noting that the synthetic virtual H&E images, while realistic, may not fully capture the nuances and variability of real-world tissue samples. Validating the model's performance on a broader range of real H&E data, including samples from diverse tissue types and disease states, would be an important next step.

Overall, this research demonstrates the potential of inter-modality learning to overcome the limitations of traditional cell type classification methods and open up new avenues for understanding the complex cellular interactions within the body. As the field of computational pathology continues to evolve, approaches like this one may help unlock valuable insights that can inform medical research and clinical decision-making.

Conclusion

This paper presents a novel approach to classifying different cell types in colon tissue samples using H&E staining. By leveraging the detailed cell subclass labeling provided by multiplexed immunofluorescence (MxIF) imaging and using style transfer to synthesize realistic virtual H&E images from the same tissue, the researchers were able to train a deep learning model to identify hard-to-classify cell subtypes, such as helper T cells and epithelial progenitors.

While the model's performance still has room for improvement, this work represents an important step forward in automating cell type identification on H&E images, which could have significant implications for advancing our understanding of cellular interactions and informing medical research and clinical decision-making.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology

Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Nancy R. Newlin, Adam M. Saunders, Can Cui, Jia Li, Qi Liu, Ken S. Lau, Joseph T. Roland, Mary K Washington, Lori A. Coburn, Keith T. Wilson, Yuankai Huo, Bennett A. Landman

Understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions. H&E is widely available, however, cell subtyping often requires expert knowledge and the use of specialized stains. To reduce the annotation burden, AI has been proposed for the classification of cells on H&E. For example, the recent Colon Nucleus Identification and Classification (CoNIC) Challenge focused on labeling 6 cell types on H&E of the colon. However, the CoNIC Challenge was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We use inter-modality learning to label previously un-labelable cell types on H&E. We take advantage of multiplexed immunofluorescence (MxIF) histology to label 14 cell subclasses. We performed style transfer on the same MxIF tissues to synthesize realistic virtual H&E which we paired with the MxIF-derived cell subclassification labels. We evaluated the efficacy of using a supervised learning scheme where the input was realistic-quality virtual H&E and the labels were MxIF-derived cell subclasses. We assessed our model on private virtual H&E and public real H&E. On virtual H&E, we were able to classify helper T cells and epithelial progenitors with positive predictive values of $0.34 \pm 0.15$ (prevalence $0.03 \pm 0.01$) and $0.47 \pm 0.1$ (prevalence $0.07 \pm 0.02$) respectively, when using ground truth centroid information. On real H&E we could classify helper T cells and epithelial progenitors with upper bound positive predictive values of $0.43 \pm 0.03$ (parent class prevalence 0.21) and $0.94 \pm 0.02$ (parent class prevalence 0.49) when using ground truth centroid information. This is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E.

7/9/2024


Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images

Masoud Tafavvoghi, Anders Sildnes, Mehrdad Rakaee, Nikita Shvetsov, Lars Ailo Bongo, Lill-Tove Rasmussen Busund, Kajsa Møllersen

Classifying breast cancer molecular subtypes is crucial for tailoring treatment strategies. While immunohistochemistry (IHC) and gene expression profiling are standard methods for molecular subtyping, IHC can be subjective, and gene profiling is costly and not widely accessible in many regions. Previous approaches have highlighted the potential application of deep learning models on H&E-stained whole slide images (WSI) for molecular subtyping, but these efforts vary in their methods, datasets, and reported performance. In this work, we investigated whether H&E-stained WSIs could be solely leveraged to predict breast cancer molecular subtypes (luminal A, B, HER2-enriched, and Basal). We used 1,433 WSIs of breast cancer in a two-step pipeline: first, classifying tumor and non-tumor tiles to use only the tumor regions for molecular subtyping; and second, employing a One-vs-Rest (OvR) strategy to train four binary OvR classifiers and aggregating their results using an eXtreme Gradient Boosting (XGBoost) model. The pipeline was tested on 221 hold-out WSIs, achieving an overall macro F1 score of 0.95 for tumor detection and 0.73 for molecular subtyping. Our findings suggest that, with further validation, supervised deep learning models could serve as supportive tools for molecular subtyping in breast cancer. Our codes are made available to facilitate ongoing research and development.

9/17/2024

Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology

Tiago Gonçalves, Dagoberto Pulido-Arias, Julian Willett, Katharina V. Hoebel, Mason Cleveland, Syed Rakin Ahmed, Elizabeth Gerstner, Jayashree Kalpathy-Cramer, Jaime S. Cardoso, Christopher P. Bridge, Albert E. Kim

The interactions between tumor cells and the tumor microenvironment (TME) dictate therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, there is not a widely available method to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess activity of ten biologically relevant pathways from the hematoxylin and eosin (H&E) slide of primary breast tumors. We employed different feature extraction approaches and state-of-the-art model architectures. Using binary classification, our models attained area under the receiver operating characteristic (AUROC) scores above 0.70 for nearly all gene expression pathways and, in some cases, exceeded 0.80. Attention maps suggest that our trained models recognize biologically relevant spatial patterns of cell sub-populations from H&E. These efforts represent a first step towards developing computational H&E biomarkers that reflect facets of the TME and hold promise for augmenting precision oncology.

4/26/2024


Immunohistochemistry guided segmentation of benign epithelial cells, in situ lesions, and invasive epithelial cells in breast cancer slides

Maren Høibø, André Pedersen, Vibeke Grotnes Dale, Sissel Marie Berget, Borgny Ytterhus, Cecilia Lindskog, Elisabeth Wik, Lars A. Akslen, Ingerid Reinertsen, Erik Smistad, Marit Valla

Digital pathology enables automatic analysis of histopathological sections using artificial intelligence (AI). Automatic evaluation could improve diagnostic efficiency and help find associations between morphological features and clinical outcome. For development of such prediction models, identifying invasive epithelial cells, and separating these from benign epithelial cells and in situ lesions would be the first step. In this study, we aimed to develop an AI model for segmentation of epithelial cells in sections from breast cancer. We generated epithelial ground truth masks by restaining hematoxylin and eosin (HE) sections with cytokeratin (CK) AE1/AE3, and by pathologists' annotations. HE/CK image pairs were used to train a convolutional neural network, and data augmentation was used to make the model more robust. Tissue microarrays (TMAs) from 839 patients, and whole slide images from two patients were used for training and evaluation of the models. The sections were derived from four cohorts of breast cancer patients. TMAs from 21 patients from a fifth cohort were used as a second test set. In quantitative evaluation, mean Dice scores of 0.70, 0.79, and 0.75 were achieved for invasive epithelial cells, benign epithelial cells, and in situ lesions, respectively. In qualitative scoring (0-5) by pathologists, results were best for all epithelium and invasive epithelium, with scores of 4.7 and 4.4. Scores for benign epithelium and in situ lesions were 3.7 and 2.0. The proposed model segmented epithelial cells in HE stained breast cancer slides well, but further work is needed for accurate division between the classes. Immunohistochemistry, together with pathologists' annotations, enabled the creation of accurate ground truths. The model is made freely available in FastPathology and the code is available at https://github.com/AICAN-Research/breast-epithelium-segmentation

6/17/2024