Multi-omics Prediction from High-content Cellular Imaging with Deep Learning

Read original: arXiv:2306.09391 - Published 5/22/2024 by Rahil Mehrizi, Arash Mehrjou, Maryana Alegro, Yi Zhao, Benedetta Carbone, Carl Fishwick, Johanna Vappiani, Jing Bi, Siobhan Sanford, Hakan Keles and 3 others

Overview

This paper explores whether cell imaging data can be used to predict bulk multi-omics measurements, such as transcriptomics and proteomics, in a cell population.
The researchers developed a deep learning approach called Image2Omics that predicts multi-omics data directly from high-content cell images.
They evaluated Image2Omics on gene-edited macrophages derived from human induced pluripotent stem cells under different stimulation conditions.

Plain English Explanation

The researchers wanted to see if the visual information in cell images could be used to predict the molecular changes happening inside those cells. Cell imaging, transcriptomics (measuring gene activity), and proteomics (measuring protein levels) each provide different but complementary views of what's going on in cells. However, it wasn't clear whether cell imaging alone could be used to predict the multi-omics measurements.

To explore this, the researchers developed a deep learning approach called Image2Omics that takes cell images as input and tries to predict the transcriptomics and proteomics data for that cell population. They tested this on macrophages (a type of immune cell) that had been genetically engineered and stimulated in different ways.

The results showed that Image2Omics predicted the levels of a subset of transcripts (messenger RNAs) and proteins directly from the cell images, and did so significantly better than a baseline that simply uses the average levels from the training data. This suggests that cell imaging may be able to serve as a more scalable and efficient way to get insights into the molecular state of cells, at least for certain applications. However, only a minority of transcripts and proteins were predictable, so cell imaging may not be able to fully replace multi-omics measurements in all cases.

Technical Explanation

Image2Omics is a deep learning model that takes high-content cell images as input and predicts bulk transcriptomics and proteomics measurements for the imaged cell population. The researchers evaluated it on gene-edited macrophages derived from human induced pluripotent stem cells that were stimulated to take on an M1 or M2 phenotype.

Image2Omics was able to significantly outperform predictions based on the mean observed training set abundances. The researchers found that 18.72% (95% CI: 6.52%, 35.52%) and 13.38% (95% CI: 4.10%, 32.21%) of transcripts were predictable in M1 and M2 macrophages, respectively. For proteins, 8.46% (95% CI: 0.58%, 25.83%) and 13.98% (95% CI: 2.41%, 32.83%) were predictable in M1 and M2 macrophages.
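To make the comparison against the mean baseline concrete, here is a minimal sketch in Python. It is not the paper's evaluation code. The data layout (dictionaries keyed by feature name), the function names, and the use of a plain mean-squared-error comparison are all assumptions made for illustration. The paper reports statistically significant predictability, which would need a significance test on top of this raw comparison.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def fraction_beating_mean_baseline(train, test, predicted):
    """Share of features where model error is below the mean-baseline error.

    train, test, predicted: dicts mapping a feature name (a transcript or
    protein) to a list of abundances, one per sample. This layout is a
    hypothetical choice for the sketch.
    """
    wins = 0
    for feature, y_test in test.items():
        # Baseline: always predict the mean abundance seen in training.
        baseline_value = fmean(train[feature])
        baseline_pred = [baseline_value] * len(y_test)
        if mse(y_test, predicted[feature]) < mse(y_test, baseline_pred):
            wins += 1
    return wins / len(test)


# Toy data with made-up values, only to exercise the function.
train = {"gene_a": [1.0, 2.0, 3.0], "gene_b": [5.0, 5.0, 5.0]}
test = {"gene_a": [1.0, 3.0], "gene_b": [5.0, 5.0]}
predicted = {"gene_a": [1.1, 2.9], "gene_b": [4.0, 6.0]}

share = fraction_beating_mean_baseline(train, test, predicted)
print(f"{share:.0%} of features beat the mean baseline")
```

In this toy case the model beats the baseline for gene_a, whose abundance varies across samples, but not for gene_b, whose abundance is constant and therefore already captured exactly by the training mean.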

These results demonstrate that some transcript and protein abundances are predictable from cell imaging data, and that cell imaging may potentially serve as a scalable and resource-efficient substitute for multi-omics measurements in certain applications and settings, depending on the desired performance threshold.

Critical Analysis

The paper provides a thorough experimental evaluation of the Image2Omics approach and offers promising results. However, the authors acknowledge that the predictability of multi-omics from cell imaging is still limited, with less than 20% of transcripts and less than 14% of proteins being predictable in their experiments.

Additionally, the researchers only tested their approach on a specific cell type (macrophages) under two stimulation conditions. It's unclear whether the results would generalize to other cell types, experimental conditions, or multi-omics measurements. Further research is needed to test the approach more broadly and understand the biological determinants that allow some multi-omics features to be more predictable from cell imaging than others.

The paper also does not discuss potential biases or confounding factors that could influence the predictability of multi-omics from cell imaging, such as imaging artifacts, cell heterogeneity, or the specific choice of fluorescent dyes used. Addressing these limitations could help refine the approach and improve its performance and reliability.

Conclusion

This research demonstrates the potential for using deep learning to predict bulk multi-omics measurements directly from high-content cell imaging data. While the predictability is still limited, the results suggest that cell imaging may be a scalable and resource-efficient complement to, or even substitute for, multi-omics measurements in certain applications, depending on the specific needs and desired performance.

Further research is needed to better understand the biological underpinnings of this relationship and to expand the approach to additional cell types, experimental conditions, and multi-omics data. Addressing potential sources of bias and confounding factors could also help improve the reliability and generalizability of the Image2Omics approach.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Multi-omics Prediction from High-content Cellular Imaging with Deep Learning

Rahil Mehrizi, Arash Mehrjou, Maryana Alegro, Yi Zhao, Benedetta Carbone, Carl Fishwick, Johanna Vappiani, Jing Bi, Siobhan Sanford, Hakan Keles, Marcus Bantscheff, Cuong Nguyen, Patrick Schwab

High-content cellular imaging, transcriptomics, and proteomics data provide rich and complementary views on the molecular layers of biology that influence cellular states and function. However, the biological determinants through which changes in multi-omics measurements influence cellular morphology have not yet been systematically explored, and the degree to which cell imaging could potentially enable the prediction of multi-omics directly from cell imaging data is therefore currently unclear. Here, we address the question of whether it is possible to predict bulk multi-omics measurements directly from cell images using Image2Omics - a deep learning approach that predicts multi-omics in a cell population directly from high-content images of cells stained with multiplexed fluorescent dyes. We perform an experimental evaluation in gene-edited macrophages derived from human induced pluripotent stem cells (hiPSC) under multiple stimulation conditions and demonstrate that Image2Omics achieves significantly better performance in predicting transcriptomics and proteomics measurements directly from cell images than predictions based on the mean observed training set abundance. We observed significant predictability of abundances for 4927 (18.72%; 95% CI: 6.52%, 35.52%) and 3521 (13.38%; 95% CI: 4.10%, 32.21%) transcripts out of 26137 in M1 and M2-stimulated macrophages respectively and for 422 (8.46%; 95% CI: 0.58%, 25.83%) and 697 (13.98%; 95% CI: 2.41%, 32.83%) proteins out of 4986 in M1 and M2-stimulated macrophages respectively. Our results show that some transcript and protein abundances are predictable from cell imaging and that cell imaging may potentially, in some settings and depending on the mechanisms of interest and desired performance threshold, even be a scalable and resource-efficient substitute for multi-omics measurements.

5/22/2024

A Novel Generative Artificial Intelligence Method for Interference Study on Multiplex Brightfield Immunohistochemistry Images

Satarupa Mukherjee, Jim Martin, Yao Nie

Multiplex brightfield imaging offers the advantage of simultaneously analyzing multiple biomarkers on a single slide, as opposed to single biomarker labeling on multiple consecutive slides. To accurately analyze multiple biomarkers localized at the same cellular compartment, two representative biomarker sets were selected as assay models - cMET-PDL1-EGFR and CD8-LAG3-PDL1, where all three biomarkers can co-localize on the cell membrane. One of the most crucial preliminary stages for analyzing such assay is identifying each unique chromogen on individual cells. This is a challenging problem due to the co-localization of membrane stains from all the three biomarkers. It requires advanced color unmixing for creating the equivalent singleplex images from each triplex image for each biomarker. In this project, we developed a cycle-Generative Adversarial Network (cycle-GAN) method for unmixing the triplex images generated from the above-mentioned assays. Three different models were designed to generate the singleplex image for each of the three stains Tamra (purple), QM-Dabsyl (yellow) and Green. A notable novelty of our approach was that the input to the network were images in the optical density domain instead of conventionally used RGB images. The use of the optical density domain helped in reducing the blurriness of the synthetic singleplex images, which was often observed when the network was trained on RGB images. The cycle-GAN models were validated on 10,800 lung, gastric and colon images for the cMET-PDL1-EGFR assay and 3600 colon images for the CD8-LAG3-PDL1 assay. Visual as well as quantified assessments demonstrated that the proposed method is effective and efficient when compared with the manual reviewing results and is readily applicable to various multiplex assays.

8/16/2024

Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology

Tiago Gonçalves, Dagoberto Pulido-Arias, Julian Willett, Katharina V. Hoebel, Mason Cleveland, Syed Rakin Ahmed, Elizabeth Gerstner, Jayashree Kalpathy-Cramer, Jaime S. Cardoso, Christopher P. Bridge, Albert E. Kim

The interactions between tumor cells and the tumor microenvironment (TME) dictate therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, there is not a widely available method to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess activity of ten biologically relevant pathways from the hematoxylin and eosin (H&E) slide of primary breast tumors. We employed different feature extraction approaches and state-of-the-art model architectures. Using binary classification, our models attained area under the receiver operating characteristic (AUROC) scores above 0.70 for nearly all gene expression pathways and on some cases, exceeded 0.80. Attention maps suggest that our trained models recognize biologically relevant spatial patterns of cell sub-populations from H&E. These efforts represent a first step towards developing computational H&E biomarkers that reflect facets of the TME and hold promise for augmenting precision oncology.

4/26/2024

Histopathological Image Classification with Cell Morphology Aware Deep Neural Networks

Andrey Ignatov, Josephine Yates, Valentina Boeva

Histopathological images are widely used for the analysis of diseased (tumor) tissues and patient treatment selection. While the majority of microscopy image processing was previously done manually by pathologists, recent advances in computer vision allow for accurate recognition of lesion regions with deep learning-based solutions. Such models, however, usually require extensive annotated datasets for training, which is often not the case in the considered task, where the number of available patient data samples is very limited. To deal with this problem, we propose a novel DeepCMorph model pre-trained to learn cell morphology and identify a large number of different cancer types. The model consists of two modules: the first one performs cell nuclei segmentation and annotates each cell type, and is trained on a combination of 8 publicly available datasets to ensure its high generalizability and robustness. The second module combines the obtained segmentation map with the original microscopy image and is trained for the downstream task. We pre-trained this module on the Pan-Cancer TCGA dataset consisting of over 270K tissue patches extracted from 8736 diagnostic slides from 7175 patients. The proposed solution achieved a new state-of-the-art performance on the dataset under consideration, detecting 32 cancer types with over 82% accuracy and outperforming all previously proposed solutions by more than 4%. We demonstrate that the resulting pre-trained model can be easily fine-tuned on smaller microscopy datasets, yielding superior results compared to the current top solutions and models initialized with ImageNet weights. The codes and pre-trained models presented in this paper are available at: https://github.com/aiff22/DeepCMorph

7/12/2024