Improving Quality Control of Whole Slide Images by Explicit Artifact Augmentation

Read original: arXiv:2406.11538 - Published 6/18/2024 by Artur Jurgas, Marek Wodzinski, Marina D'Amato, Jeroen van der Laak, Manfredo Atzori, Henning Muller

Overview

This research paper explores methods to improve quality control of whole slide images (WSIs) in computational pathology systems.
The authors propose an "explicit artifact augmentation" technique to train deep learning models to better detect and process common imaging artifacts.
The goal is to make computational pathology systems robust to the varied image quality problems that arise in real-world clinical settings.

Plain English Explanation

Medical professionals often use whole slide imaging (WSI) to digitally capture high-resolution images of tissue samples for analysis. However, these digital slides can suffer from various image quality issues or "artifacts" like blurriness, uneven lighting, or dust particles. These artifacts can interfere with the accuracy of automated pathology analysis systems.

The researchers in this paper developed a new technique to help computational pathology models learn to recognize and handle these common image artifacts. Rather than just training on "clean" images, they explicitly augmented the training data with realistic-looking artifacts. This helps the models become more robust and capable of handling the types of image quality issues they would encounter in real-world clinical settings.

The goal is to create more reliable and trustworthy computational pathology systems that can provide accurate diagnoses even in the face of imperfect input images. This could lead to improved patient outcomes and more efficient clinical workflows.

Technical Explanation

The key innovation in this paper is the use of "explicit artifact augmentation" to train deep learning models for computational pathology tasks. Rather than relying on the model to learn to handle image artifacts on its own, the researchers deliberately added realistic-looking artifacts to the training data.

This was done by first identifying common types of WSI artifacts, such as blurriness, uneven illumination, and the presence of dust/debris. Their tool then generates these artifacts from an external artifact library and blends them into clean images from a given histopathology dataset. The augmented training set, containing a mix of clean and artifacted images, was then used to train the deep learning models.
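The blending step described above can be illustrated with a minimal sketch. It assumes grayscale tiles stored as nested lists of floats in [0, 1] and uses plain per-pixel alpha blending; the summary does not describe the paper's actual blending method, so this is a stand-in, and the names `blend_artifact` and `augment_dataset` are hypothetical.

```python
import random


def blend_artifact(tile, artifact, alpha, top, left):
    """Alpha-blend an artifact patch into a copy of a clean tile.

    tile, artifact: 2D lists of floats in [0, 1] (grayscale, an assumption).
    alpha: 2D list with the artifact's shape, per-pixel opacity in [0, 1].
    Returns (augmented_tile, mask); mask marks every pixel the artifact touched.
    """
    height, width = len(tile), len(tile[0])
    augmented = [row[:] for row in tile]  # leave the input tile untouched
    mask = [[0] * width for _ in range(height)]
    for i, artifact_row in enumerate(artifact):
        for j, value in enumerate(artifact_row):
            y, x = top + i, left + j
            opacity = alpha[i][j]
            # Skip pixels that fall outside the tile or are fully transparent.
            if 0 <= y < height and 0 <= x < width and opacity > 0:
                augmented[y][x] = (1 - opacity) * tile[y][x] + opacity * value
                mask[y][x] = 1
    return augmented, mask


def augment_dataset(tiles, artifact, alpha, fraction=0.5, seed=0):
    """Return (tile, label) pairs; label is 1 when an artifact was blended in."""
    rng = random.Random(seed)
    samples = []
    for tile in tiles:
        if rng.random() < fraction:
            top = rng.randrange(len(tile))
            left = rng.randrange(len(tile[0]))
            augmented, mask = blend_artifact(tile, artifact, alpha, top, left)
            label = int(any(any(row) for row in mask))
            samples.append((augmented, label))
        else:
            samples.append((tile, 0))
    return samples


# Toy usage: a dark 2x2 "dust" patch blended into uniform 4x4 tiles.
clean_tiles = [[[0.8] * 4 for _ in range(4)] for _ in range(6)]
dust = [[0.0, 0.0], [0.0, 0.0]]
dust_alpha = [[0.5, 0.5], [0.5, 0.5]]
training_set = augment_dataset(clean_tiles, dust, dust_alpha, fraction=0.5)
```

Returning the mask alongside the augmented tile is a deliberate choice in this sketch: the same call yields both the training image and its ground-truth annotation, which is the kind of labeled data the paper says is expensive to produce by hand.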

The authors evaluated this approach using two main experiments. First, they trained a classification model to detect the presence of image artifacts, and showed that the model trained with explicit augmentation outperformed a baseline trained without it, with AUROC gains between 0.01 and 0.10 depending on the artifact type. Second, they integrated the artifact detection capabilities into an end-to-end computational pathology pipeline for breast cancer diagnosis, and demonstrated improved performance on real-world WSI data.
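The paper's abstract reports its classification results as AUROC, the probability that a randomly chosen artifact tile receives a higher score than a randomly chosen clean tile. A minimal sketch of that metric follows, using the pairwise definition rather than any particular library; the function name `auroc` and the scores in the example are made up for illustration and are not results from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (clean) or 1 (artifact).
    scores: classifier outputs, higher meaning "more likely an artifact".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy comparison with invented scores for one artifact type.
labels = [0, 0, 0, 1, 1, 1]
baseline_scores = [0.2, 0.6, 0.4, 0.5, 0.7, 0.3]
augmented_scores = [0.2, 0.4, 0.3, 0.5, 0.8, 0.6]
gain = auroc(labels, augmented_scores) - auroc(labels, baseline_scores)
```

The pairwise loop is quadratic in the number of tiles, which is fine for a sketch; a rank-based implementation would be the usual choice for large evaluation sets. Computing the gain separately for each artifact type gives per-type differences of the kind the abstract describes.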

The key insight is that by deliberately exposing the models to a diverse range of realistic artifacts during training, they become better equipped to handle the noisy, imperfect input data encountered in clinical practice. This "inoculates" the models against the common pitfalls that can undermine the reliability of computational pathology systems.

Critical Analysis

The research presented in this paper makes a valuable contribution towards improving the robustness and trustworthiness of computational pathology systems. The authors' approach of explicitly augmenting the training data with common WSI artifacts is a clever and practical solution to a real-world challenge.

That said, the paper does not address some potential limitations and areas for further research. For example, the set of artifacts considered is still relatively limited, and it's unclear how well the approach would generalize to rarer or more complex types of image degradation. Additionally, the impact of artifact augmentation on model interpretability and explainability was not explored.

There are also open questions around how to best integrate artifact detection and handling capabilities into end-to-end computational pathology workflows. The paper demonstrates improved performance on a breast cancer diagnosis task, but further work is needed to understand the broader implications and potential pitfalls of this approach.

Overall, the research presented in this paper represents an important step forward, but additional work is needed to fully address the challenges of enabling reliable, high-performance computational pathology systems in real-world clinical settings.

Conclusion

This research paper introduces a novel technique called "explicit artifact augmentation" to improve the quality control of whole slide images (WSIs) in computational pathology systems. By deliberately exposing deep learning models to a diverse range of realistic imaging artifacts during training, the authors have shown that these models can become more robust and reliable at handling the types of image quality issues encountered in real-world clinical practice.

The key benefit of this approach is that it helps create computational pathology systems that are less susceptible to being undermined by imperfect input data, leading to more trustworthy and accurate diagnoses. This could have significant implications for improving patient outcomes and streamlining clinical workflows in pathology.

While the research presented here is a valuable contribution, there are still open challenges and areas for further exploration. Nonetheless, the explicit artifact augmentation technique represents an important step forward in enabling high-performance, trustworthy computational pathology systems that can thrive in the messy realities of modern healthcare.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Improving Quality Control of Whole Slide Images by Explicit Artifact Augmentation

Artur Jurgas, Marek Wodzinski, Marina D'Amato, Jeroen van der Laak, Manfredo Atzori, Henning Muller

The problem of artifacts in whole slide image acquisition, prevalent in both clinical workflows and research-oriented settings, necessitates human intervention and re-scanning. Overcoming this challenge requires developing quality control algorithms, that are hindered by the limited availability of relevant annotated data in histopathology. The manual annotation of ground-truth for artifact detection methods is expensive and time-consuming. This work addresses the issue by proposing a method dedicated to augmenting whole slide images with artifacts. The tool seamlessly generates and blends artifacts from an external library to a given histopathology dataset. The augmented datasets are then utilized to train artifact classification methods. The evaluation shows their usefulness in classification of the artifacts, where they show an improvement from 0.10 to 0.01 AUROC depending on the artifact type. The framework, model, weights, and ground-truth annotations are freely released to facilitate open science and reproducible research.

6/18/2024

Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs

Neel Kanwal, Farbod Khoraminia, Umay Kiraz, Andres Mosquera-Zamudio, Carlos Monteagudo, Emiel A. M. Janssen, Tahlita C. M. Zuiverloon, Chunmig Rong, Kjersti Engan

Histopathology is a gold standard for cancer diagnosis under a microscopic examination. However, histological tissue processing procedures result in artifacts, which are ultimately transferred to the digitized version of glass slides, known as whole slide images (WSIs). Artifacts are diagnostically irrelevant areas and may result in wrong deep learning (DL) algorithms predictions. Therefore, detecting and excluding artifacts in the computational pathology (CPATH) system is essential for reliable automated diagnosis. In this paper, we propose a mixture of experts (MoE) scheme for detecting five notable artifacts, including damaged tissue, blur, folded tissue, air bubbles, and histologically irrelevant blood from WSIs. First, we train independent binary DL models as experts to capture particular artifact morphology. Then, we ensemble their predictions using a fusion mechanism. We apply probabilistic thresholding over the final probability distribution to improve the sensitivity of the MoE. We developed DL pipelines using two MoEs and two multiclass models of state-of-the-art deep convolutional neural networks (DCNNs) and vision transformers (ViTs). DCNNs-based MoE and ViTs-based MoE schemes outperformed simpler multiclass models and were tested on datasets from different hospitals and cancer types, where MoE using DCNNs yielded the best results. The proposed MoE yields 86.15% F1 and 97.93% sensitivity scores on unseen data, retaining less computational cost for inference than MoE using ViTs. This best performance of MoEs comes with relatively higher computational trade-offs than multiclass models. The proposed artifact detection pipeline will not only ensure reliable CPATH predictions but may also provide quality control.

5/24/2024

Rethinking Histology Slide Digitization Workflows for Low-Resource Settings

Talat Zehra, Joseph Marino, Wendy Wang, Grigoriy Frantsuzov, Saad Nadeem

Histology slide digitization is becoming essential for telepathology (remote consultation), knowledge sharing (education), and using the state-of-the-art artificial intelligence algorithms (augmented/automated end-to-end clinical workflows). However, the cumulative costs of digital multi-slide high-speed brightfield scanners, cloud/on-premises storage, and personnel (IT and technicians) make the current slide digitization workflows out-of-reach for limited-resource settings, further widening the health equity gap; even single-slide manual scanning commercial solutions are costly due to hardware requirements (high-resolution cameras, high-spec PC/workstation, and support for only high-end microscopes). In this work, we present a new cloud slide digitization workflow for creating scanner-quality whole-slide images (WSIs) from uploaded low-quality videos, acquired from cheap and inexpensive microscopes with built-in cameras. Specifically, we present a pipeline to create stitched WSIs while automatically deblurring out-of-focus regions, upsampling input 10X images to 40X resolution, and reducing brightness/contrast and light-source illumination variations. We demonstrate the WSI creation efficacy from our workflow on World Health Organization-declared neglected tropical disease, Cutaneous Leishmaniasis (prevalent only in the poorest regions of the world and only diagnosed by sub-specialist dermatopathologists, rare in poor countries), as well as other common pathologies on core biopsies of breast, liver, duodenum, stomach and lymph node. The code and pretrained models will be accessible via our GitHub (https://github.com/nadeemlab/DeepLIIF), and the cloud platform will be available at https://deepliif.org for uploading microscope videos and downloading/viewing WSIs with shareable links (no sign-in required) for telepathology and knowledge sharing.

5/15/2024

WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering

Pingyi Chen, Chenglu Zhu, Sunyi Zheng, Honglin Li, Lin Yang

Whole slide imaging is routinely adopted for carcinoma diagnosis and prognosis. Abundant experience is required for pathologists to achieve accurate and reliable diagnostic results of whole slide images (WSI). The huge size and heterogeneous features of WSIs make the workflow of pathological reading extremely time-consuming. In this paper, we propose a novel framework (WSI-VQA) to interpret WSIs by generative visual question answering. WSI-VQA shows universality by reframing various kinds of slide-level tasks in a question-answering pattern, in which pathologists can achieve immunohistochemical grading, survival prediction, and tumor subtyping following human-machine interaction. Furthermore, we establish a WSI-VQA dataset which contains 8672 slide-level question-answering pairs with 977 WSIs. Besides the ability to deal with different slide-level tasks, our generative model which is named Wsi2Text Transformer (W2T) outperforms existing discriminative models in medical correctness, which reveals the potential of our model to be applied in the clinical scenario. Additionally, we also visualize the co-attention mapping between word embeddings and WSIs as an intuitive explanation for diagnostic results. The dataset and related code are available at https://github.com/cpystan/WSI-VQA.

7/9/2024