ColonMapper: topological mapping and localization for colonoscopy

Read original: arXiv:2305.05546 - Published 7/11/2024 by Javier Morlana, Juan D. Tardós, J. M. M. Montiel

Overview

Proposes a topological mapping and localization system for real human colonoscopies
Addresses challenges of significant shape and illumination changes in the colon
Builds a map as a graph where each node represents a colon location using real images
Edges represent traversability between nodes
Uses deep global descriptors and a Bayesian filter for long-term place recognition

Plain English Explanation

The paper presents ColonMapper, a system that builds a map of the human colon and then localizes within that map, even when the colon's shape and lighting change significantly. This matters for colonoscopy, where the doctor has to navigate the endoscope through the colon.

The map is a graph, where each node represents a specific location in the colon, coded by a set of real images. The edges between nodes show how the colon can be traversed from one location to another. For images taken close together in time, where the scene hasn't changed much, the system can use recent transformer-based algorithms to match features and recognize places.

However, when dealing with longer-term changes, like different colonoscopies of the same patient, these feature-matching approaches fail. To address this, the researchers trained a deep neural network to learn a global descriptor that can recognize places even with significant changes. They also added a Bayesian filter to further improve the accuracy of long-term place recognition and relocalization within the map.

The experiments show that the ColonMapper system can autonomously build a map of the colon and then localize within that map, both during the same colonoscopy and across different colonoscopies of the same patient.

Technical Explanation

The ColonMapper system represents the colon as a graph, where each node is a location coded by a set of real images, and the edges represent traversability between locations. For short-term, close-in-time images where the scene changes are minor, the system can use recent transformer-based local feature matching algorithms to successfully recognize places.
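The graph structure described above can be sketched in a few lines of Python. This is only an illustrative data structure, not the authors' implementation: the class names `Node` and `TopologicalMap`, the method names, and the frame identifiers are all hypothetical, and edges are assumed to be undirected.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One colon location, coded by the set of images observed there."""
    node_id: int
    image_ids: list = field(default_factory=list)  # hypothetical frame identifiers


class TopologicalMap:
    """Graph whose nodes are places and whose edges mean 'traversable'."""

    def __init__(self):
        self.nodes = {}   # node_id -> Node
        self.edges = {}   # node_id -> set of neighbouring node ids

    def add_node(self, image_ids):
        node_id = len(self.nodes)
        self.nodes[node_id] = Node(node_id, list(image_ids))
        self.edges[node_id] = set()
        return node_id

    def connect(self, a, b):
        # Assumption: traversability is symmetric, so store both directions.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def neighbours(self, node_id):
        return sorted(self.edges[node_id])


# Build a tiny three-node chain, as if the endoscope visited three places in order.
colon_map = TopologicalMap()
n0 = colon_map.add_node(["frame_0001", "frame_0002"])
n1 = colon_map.add_node(["frame_0040"])
n2 = colon_map.add_node(["frame_0095", "frame_0096"])
colon_map.connect(n0, n1)
colon_map.connect(n1, n2)

print(colon_map.neighbours(n1))
```

Keeping several images per node, rather than a single keyframe, mirrors the description of a location being coded by a set of real images.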

However, the researchers found that these feature-based matching approaches fail under long-term changes, such as between different colonoscopies of the same patient. To address this, they trained a deep neural network to learn a global descriptor that can achieve high recall even with significant scene changes.
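To make the idea of place recognition with a global descriptor concrete, the sketch below ranks map nodes by the cosine similarity between a query descriptor and the descriptors stored at each node. It rests on several assumptions that are not in the source: the toy 3-dimensional vectors stand in for the output of the trained network, cosine similarity is an assumed choice of metric, and scoring a node by its best-matching image is an assumed aggregation rule.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_nodes(query, node_descriptors):
    """Score each node by its best-matching image descriptor, best node first."""
    scores = {
        node_id: max(cosine_similarity(query, d) for d in descriptors)
        for node_id, descriptors in node_descriptors.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy descriptors (hypothetical values); a real descriptor would come from the network.
node_descriptors = {
    0: [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    1: [[0.0, 1.0, 0.0]],
    2: [[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]],
}
query = [0.2, 0.1, 0.9]

ranking = rank_nodes(query, node_descriptors)
best_node, best_score = ranking[0]
print(best_node, round(best_score, 3))
```

A single global vector per image makes retrieval a cheap nearest-neighbour lookup, which is one reason such descriptors are commonly used when local feature matching is unreliable.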

Additionally, the researchers incorporated a Bayesian filter to further boost the accuracy of long-term place recognition and enable relocalization within the previously built map. Their experiments demonstrate that the ColonMapper system can autonomously construct a map and then localize against it in two key use cases: within the same colonoscopy and across different colonoscopies of the same patient.
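The summary does not spell out how the Bayesian filter is formulated, so the following is a generic discrete Bayes filter over map nodes, offered as a sketch of the general technique rather than the paper's method. The motion model (stay with probability `stay_prob`, otherwise move to a neighbouring node), the value 0.6, and the per-node likelihoods are all assumptions made for illustration.

```python
def predict(belief, edges, stay_prob=0.6):
    """Prediction step: belief either stays at a node or moves along an edge."""
    predicted = {node: 0.0 for node in belief}
    for node, prob in belief.items():
        neighbours = edges[node]
        if not neighbours:
            predicted[node] += prob  # isolated node: nowhere else to go
            continue
        predicted[node] += stay_prob * prob
        share = (1.0 - stay_prob) * prob / len(neighbours)
        for nb in neighbours:
            predicted[nb] += share
    return predicted


def update(predicted, likelihood):
    """Update step: weight by the observation likelihood, then normalise."""
    unnormalised = {n: predicted[n] * likelihood[n] for n in predicted}
    total = sum(unnormalised.values())
    return {n: v / total for n, v in unnormalised.items()}


# A chain of four places: 0 - 1 - 2 - 3.
edges = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
belief = {node: 0.25 for node in edges}  # uniform prior: location unknown

# Hypothetical likelihoods, e.g. derived from descriptor similarity per frame.
observations = [
    {0: 0.2, 1: 0.7, 2: 0.3, 3: 0.6},  # ambiguous: nodes 1 and 3 both look plausible
    {0: 0.1, 1: 0.4, 2: 0.8, 3: 0.3},  # next frame favours node 2
]
for likelihood in observations:
    belief = update(predict(belief, edges), likelihood)

best = max(belief, key=belief.get)
print(best)
```

The point of the filter is that graph connectivity carries information: an isolated high-similarity match far from the current belief is down-weighted, while matches consistent with moving along edges reinforce each other over consecutive frames.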

Critical Analysis

The paper presents a promising approach to topological mapping and localization in the challenging domain of human colonoscopies. The use of a global descriptor and Bayesian filtering to address long-term changes is a notable contribution, as feature-based matching alone is often insufficient in such scenarios.

However, the paper does not provide much detail on the specific neural network architecture or training process used for the global descriptor. Additionally, the experiments are limited to a relatively small dataset of real colonoscopies, and it would be valuable to see the system's performance evaluated on a larger and more diverse dataset.

Another potential limitation is the reliance on real image data, which may not be readily available in all settings. Exploring the system's performance on synthetic or augmented data could help expand its applicability.

It would also be interesting to see how the ColonMapper system compares to other state-of-the-art approaches in 3D reconstruction or SLAM for colonoscopy applications, as well as its potential integration with surgical guidance systems.

Overall, the ColonMapper system represents an important step forward in addressing the challenges of topological mapping and localization in the medical domain, and the researchers' focus on real-world applicability is commendable. Further exploration of the system's capabilities and limitations could yield valuable insights for the field.

Conclusion

The ColonMapper system proposed in this paper presents a novel approach to topological mapping and localization for real human colonoscopies. By addressing the significant challenges of shape and illumination changes, the system can build a map of the colon and then relocalize within that map, even across different colonoscopies of the same patient.

The use of deep global descriptors and Bayesian filtering to enhance long-term place recognition is a key contribution, overcoming the limitations of traditional feature-based matching. The experiments demonstrate the system's ability to autonomously construct and navigate a map, which could have important implications for improving the accuracy and efficiency of colonoscopy procedures.

While the paper leaves room for further exploration and refinement, the ColonMapper system represents a promising step forward in applying advanced mapping and localization techniques to the medical domain, with the potential to enhance patient care and outcomes.

Related Papers

ColonMapper: topological mapping and localization for colonoscopy

Javier Morlana, Juan D. Tardós, J. M. M. Montiel

We propose a topological mapping and localization system able to operate on real human colonoscopies, despite significant shape and illumination changes. The map is a graph where each node codes a colon location by a set of real images, while edges represent traversability between nodes. For close-in-time images, where scene changes are minor, place recognition can be successfully managed with the recent transformers-based local feature matching algorithms. However, under long-term changes -- such as different colonoscopies of the same patient -- feature-based matching fails. To address this, we train on real colonoscopies a deep global descriptor achieving high recall with significant changes in the scene. The addition of a Bayesian filter boosts the accuracy of long-term place recognition, enabling relocalization in a previously built map. Our experiments show that ColonMapper is able to autonomously build a map and localize against it in two important use cases: localization within the same colonoscopy or within different colonoscopies of the same patient. Code: https://github.com/jmorlana/ColonMapper.

7/11/2024

CudaSIFT-SLAM: multiple-map visual SLAM for full procedure mapping in real human endoscopy

Richard Elvira, Juan D. Tardós, José M. M. Montiel

Monocular visual simultaneous localization and mapping (V-SLAM) is nowadays an irreplaceable tool in mobile robotics and augmented reality, where it performs robustly. However, human colonoscopies pose formidable challenges like occlusions, blur, light changes, lack of texture, deformation, water jets or tool interaction, which result in very frequent tracking losses. ORB-SLAM3, the top performing multiple-map V-SLAM, is unable to recover from them by merging sub-maps or relocalizing the camera, due to the poor performance of its place recognition algorithm based on ORB features and DBoW2 bag-of-words. We present CudaSIFT-SLAM, the first V-SLAM system able to process complete human colonoscopies in real-time. To overcome the limitations of ORB-SLAM3, we use SIFT instead of ORB features and replace the DBoW2 direct index with the more computationally demanding brute-force matching, being able to successfully match images separated in time for relocation and map merging. Real-time performance is achieved thanks to CudaSIFT, a GPU implementation for SIFT extraction and brute-force matching. We benchmark our system in the C3VD phantom colon dataset, and in a full real colonoscopy from the Endomapper dataset, demonstrating the capabilities to merge sub-maps and relocate in them, obtaining significantly longer sub-maps. Our system successfully maps in real-time 88 % of the frames in the C3VD dataset. In a real screening colonoscopy, despite the much higher prevalence of occluded and blurred frames, the mapping coverage is 53 % in carefully explored areas and 38 % in the full sequence, a 70 % improvement over ORB-SLAM3.

5/28/2024

Structure-preserving Image Translation for Depth Estimation in Colonoscopy Video

Shuxian Wang, Akshay Paruchuri, Zhaoxi Zhang, Sarah McGill, Roni Sengupta

Monocular depth estimation in colonoscopy video aims to overcome the unusual lighting properties of the colonoscopic environment. One of the major challenges in this area is the domain gap between annotated but unrealistic synthetic data and unannotated but realistic clinical data. Previous attempts to bridge this domain gap directly target the depth estimation task itself. We propose a general pipeline of structure-preserving synthetic-to-real (sim2real) image translation (producing a modified version of the input image) to retain depth geometry through the translation process. This allows us to generate large quantities of realistic-looking synthetic images for supervised depth estimation with improved generalization to the clinical domain. We also propose a dataset of hand-picked sequences from clinical colonoscopies to improve the image translation process. We demonstrate the simultaneous realism of the translated images and preservation of depth maps via the performance of downstream depth estimation on various datasets.

8/20/2024

SimCol3D -- 3D Reconstruction during Colonoscopy Challenge

Anita Rau, Sophia Bano, Yueming Jin, Pablo Azagra, Javier Morlana, Rawen Kader, Edward Sanderson, Bogdan J. Matuszewski, Jae Young Lee, Dong-Jae Lee, Erez Posner, Netanel Frank, Varshini Elangovan, Sista Raviteja, Zhengwen Li, Jiquan Liu, Seenivasan Lalithkumar, Mobarakol Islam, Hongliang Ren, Laurence B. Lovat, José M. M. Montiel, Danail Stoyanov

Colorectal cancer is one of the most common cancers in the world. While colonoscopy is an effective screening technique, navigating an endoscope through the colon to detect polyps is challenging. A 3D map of the observed surfaces could enhance the identification of unscreened colon tissue and serve as a training platform. However, reconstructing the colon from video footage remains difficult. Learning-based approaches hold promise as robust alternatives, but necessitate extensive datasets. Establishing a benchmark dataset, the 2022 EndoVis sub-challenge SimCol3D aimed to facilitate data-driven depth and pose prediction during colonoscopy. The challenge was hosted as part of MICCAI 2022 in Singapore. Six teams from around the world and representatives from academia and industry participated in the three sub-challenges: synthetic depth prediction, synthetic pose prediction, and real pose prediction. This paper describes the challenge, the submitted methods, and their results. We show that depth prediction from synthetic colonoscopy images is robustly solvable, while pose estimation remains an open research question.

7/4/2024