EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos

Read original: arXiv:2406.03095 - Published 6/7/2024 by Ryo Fujii, Hideo Saito, Hiroki Kajita

Overview

  • This paper presents EgoSurgery-Tool, a dataset of egocentric open surgery videos annotated with bounding boxes for surgical tools and hands.
  • The dataset is intended to support research on surgical workflow analysis, surgical skill assessment, and augmented reality-based surgical guidance systems.
  • The paper also describes how the dataset was created and annotated, and benchmarks nine popular object detectors on surgical tool and hand detection.

Plain English Explanation

The EgoSurgery-Tool dataset is a collection of videos recorded during open surgical procedures, where the camera is mounted on the surgeon's head. These videos capture the surgeon's perspective, showing their hands and the surgical tools they use throughout the operation.

The researchers who created this dataset annotated the video frames with bounding boxes that mark the hands and the surgical tools in view. Other researchers can use this annotated data to develop and test computer vision algorithms for detecting surgical tools and hands.

Having this kind of dataset is important for advancing the field of surgical robotics and augmented reality-based surgical guidance systems. By understanding how surgeons interact with their tools and the surgical environment, researchers can create more intelligent and helpful technologies to assist during operations. This could lead to improvements in surgical outcomes, reduced errors, and better support for trainee surgeons.

Technical Explanation

EgoSurgery-Tool extends the existing EgoSurgery-Phase dataset, which contains real open surgery videos captured with an egocentric camera attached to the surgeon's head, along with phase annotations. According to its abstract, EgoSurgery-Phase comprises 15 hours of video spanning 9 distinct surgical phases.

The videos have been densely annotated with bounding boxes that mark the surgical tools and the hands in view. According to the paper's abstract, the dataset comprises over 49K surgical tool bounding boxes across 15 categories and over 46K hand bounding boxes.
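The paper's abstract notes that tool classes are highly imbalanced. A minimal sketch of how one might measure that imbalance from bounding-box annotations is shown below. The record layout, field names, and category labels are assumptions made for illustration (loosely COCO-style), not the dataset's actual schema.

```python
from collections import Counter

# Hypothetical annotation records. The field names and category labels
# are illustrative assumptions, not the dataset's real schema.
annotations = [
    {"image_id": 1, "category": "forceps", "bbox": [120, 80, 60, 40]},
    {"image_id": 1, "category": "hand", "bbox": [200, 150, 180, 160]},
    {"image_id": 2, "category": "forceps", "bbox": [100, 90, 55, 35]},
    {"image_id": 2, "category": "scissors", "bbox": [300, 210, 70, 30]},
    {"image_id": 3, "category": "forceps", "bbox": [110, 85, 58, 42]},
]


def class_distribution(records):
    """Return each category's share of all annotated boxes."""
    counts = Counter(record["category"] for record in records)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


distribution = class_distribution(annotations)
for name, share in sorted(distribution.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%}")
```

A skewed distribution like the toy one here is one reason detection benchmarks often report per-class results alongside an overall score.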

The paper evaluates nine popular object detectors on the dataset to assess how well they perform on both surgical tool and hand detection. The authors identify several challenges that make tool detection difficult: a highly imbalanced class distribution, tools with similar shapes and textures, and heavy occlusion.
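Object detection results are commonly scored by comparing predicted boxes with annotated ones using intersection over union (IoU). The sketch below shows that standard computation. The corner-based box format `(x_min, y_min, x_max, y_max)` is an assumption for illustration, and the summary does not state which metrics or thresholds the paper uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield no intersection.
    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)      # hypothetical detector output
ground_truth = (5, 0, 15, 10)   # hypothetical annotated box
print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

A prediction is typically counted as correct when its IoU with an annotated box of the same class exceeds a chosen threshold. Heavy occlusion tends to shrink the visible part of a tool, which makes reaching that threshold harder.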

The dataset is designed to support a range of research applications, including surgical workflow analysis, surgical skill assessment, surgical tool recognition, tool affordance and 6D pose estimation, and augmented reality-based surgical guidance.
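Several of these applications depend on knowing which tool a hand is working with. As a rough illustration only, the sketch below pairs a hand box with the tool whose box center is nearest. This heuristic, the box format `(x_min, y_min, x_max, y_max)`, and the tool labels are all assumptions made here, not a method described in the paper.

```python
import math


def center(box):
    """Center point of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def nearest_tool(hand_box, tools):
    """Return the label of the tool whose center is closest to the hand, or None.

    `tools` is a list of (label, box) pairs.
    """
    if not tools:
        return None
    hand_center = center(hand_box)
    label, _ = min(tools, key=lambda tool: math.dist(hand_center, center(tool[1])))
    return label


# Hypothetical detections for a single frame.
hand = (100, 100, 200, 200)
tools = [
    ("forceps", (140, 120, 190, 160)),
    ("scissors", (400, 300, 460, 340)),
]
print(nearest_tool(hand, tools))
```

Center distance is a deliberately simple choice. A real system would likely also consider box overlap, temporal consistency across frames, and a maximum distance beyond which no tool is assigned.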

Critical Analysis

The EgoSurgery-Tool dataset represents a valuable contribution to the field of surgical computer vision, as it provides a large-scale, annotated dataset of real-world surgical procedures. This data can be used to develop and evaluate new algorithms for tasks such as surgical tool detection and tracking, which are critical for enabling advanced surgical assistance technologies.

However, the dataset does have some limitations. The videos were recorded in a limited number of surgical settings, and the annotations may not fully capture the complexity of tool usage and hand movements during real-world operations. Additionally, the dataset does not include information about the surgical outcomes or the overall performance of the surgeons, which could be useful for assessing the potential impact of computer vision-based tools on surgical outcomes.

Despite these limitations, the EgoSurgery-Tool dataset is a significant step forward in providing the research community with the necessary data to drive progress in surgical computer vision. As the field continues to evolve, it will be important for researchers to build upon this work and explore new ways to leverage egocentric video data to enhance surgical care and support the training of future surgeons.

Conclusion

EgoSurgery-Tool gives the surgical computer vision community a large-scale dataset of egocentric open surgery videos annotated with bounding boxes for both surgical tools and hands. It can be used to develop and benchmark detection algorithms, which are a building block for surgical assistance technologies.

While the dataset has some limitations, it serves as a valuable resource for researchers working to enhance surgical workflow analysis, skill assessment, and augmented reality-based guidance systems. As the field continues to evolve, the insights gained from this dataset have the potential to lead to significant improvements in surgical outcomes and the training of future surgeons.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos

Ryo Fujii, Hideo Saito, Hiroki Kajita

Surgical tool detection is a fundamental task for understanding egocentric open surgery videos. However, detecting surgical tools presents significant challenges due to their highly imbalanced class distribution, similar shapes and similar textures, and heavy occlusion. The lack of a comprehensive large-scale dataset compounds these challenges. In this paper, we introduce EgoSurgery-Tool, an extension of the existing EgoSurgery-Phase dataset, which contains real open surgery videos captured using an egocentric camera attached to the surgeon's head, along with phase annotations. EgoSurgery-Tool has been densely annotated with surgical tools and comprises over 49K surgical tool bounding boxes across 15 categories, constituting a large-scale surgical tool detection dataset. EgoSurgery-Tool also provides annotations for hand detection with over 46K hand-bounding boxes, capturing hand-object interactions that are crucial for understanding activities in egocentric open surgery. EgoSurgery-Tool is superior to existing datasets due to its larger scale, greater variety of surgical tools, more annotations, and denser scenes. We conduct a comprehensive analysis of EgoSurgery-Tool using nine popular object detectors to assess their effectiveness in both surgical tool and hand detection. The dataset will be released at https://github.com/Fujiry0/EgoSurgery.

6/7/2024

EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos

Ryo Fujii, Masashi Hatano, Hideo Saito, Hiroki Kajita

Surgical phase recognition has gained significant attention due to its potential to offer solutions to numerous demands of the modern operating room. However, most existing methods concentrate on minimally invasive surgery (MIS), leaving surgical phase recognition for open surgery understudied. This discrepancy is primarily attributed to the scarcity of publicly available open surgery video datasets for surgical phase recognition. To address this issue, we introduce a new egocentric open surgery video dataset for phase recognition, named EgoSurgery-Phase. This dataset comprises 15 hours of real open surgery videos spanning 9 distinct surgical phases all captured using an egocentric camera attached to the surgeon's head. In addition to video, the EgoSurgery-Phase offers eye gaze. As far as we know, it is the first real open surgery video dataset for surgical phase recognition publicly available. Furthermore, inspired by the notable success of masked autoencoders (MAEs) in video understanding tasks (e.g., action recognition), we propose a gaze-guided masked autoencoder (GGMAE). Considering the regions where surgeons' gaze focuses are often critical for surgical phase recognition (e.g., surgical field), in our GGMAE, the gaze information acts as an empirical semantic richness prior to guiding the masking process, promoting better attention to semantically rich spatial regions. GGMAE significantly improves the previous state-of-the-art recognition method (6.4% in Jaccard) and the masked autoencoder-based method (3.1% in Jaccard) on EgoSurgery-Phase. The dataset will be released at https://github.com/Fujiry0/EgoSurgery.

5/31/2024

Monocular pose estimation of articulated surgical instruments in open surgery

Robert Spektor, Tom Friedman, Itay Or, Gil Bolotin, Shlomi Laufer

This work presents a novel approach to monocular 6D pose estimation of surgical instruments in open surgery, addressing challenges such as object articulations, symmetries, occlusions, and lack of annotated real-world data. The method leverages synthetic data generation and domain adaptation techniques to overcome these obstacles. The proposed approach consists of three main components: (1) synthetic data generation using 3D modeling of surgical tools with articulation rigging and physically-based rendering; (2) a tailored pose estimation framework combining object detection with pose estimation and a hybrid geometric fusion strategy; and (3) a training strategy that utilizes both synthetic and real unannotated data, employing domain adaptation on real video data using automatically generated pseudo-labels. Evaluations conducted on videos of open surgery demonstrate the good performance and real-world applicability of the proposed method, highlighting its potential for integration into medical augmented reality and robotic systems. The approach eliminates the need for extensive manual annotation of real surgical data.

7/18/2024

SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos

Chinedu Innocent Nwoye, Nicolas Padoy

Accurate tool tracking is essential for the success of computer-assisted intervention. Previous efforts often modeled tool trajectories rigidly, overlooking the dynamic nature of surgical procedures, especially tracking scenarios like out-of-body and out-of-camera views. Addressing this limitation, the new CholecTrack20 dataset provides detailed labels that account for multiple tool trajectories in three perspectives: (1) intraoperative, (2) intracorporeal, and (3) visibility, representing the different types of temporal duration of tool tracks. These fine-grained labels enhance tracking flexibility but also increase the task complexity. Re-identifying tools after occlusion or re-insertion into the body remains challenging due to high visual similarity, especially among tools of the same category. This work recognizes the critical role of the tool operators in distinguishing tool track instances, especially those belonging to the same tool category. The operators' information are however not explicitly captured in surgical videos. We therefore propose SurgiTrack, a novel deep learning method that leverages YOLOv7 for precise tool detection and employs an attention mechanism to model the originating direction of the tools, as a proxy to their operators, for tool re-identification. To handle diverse tool trajectory perspectives, SurgiTrack employs a harmonizing bipartite matching graph, minimizing conflicts and ensuring accurate tool identity association. Experimental results on CholecTrack20 demonstrate SurgiTrack's effectiveness, outperforming baselines and state-of-the-art methods with real-time inference capability. This work sets a new standard in surgical tool tracking, providing dynamic trajectories for more adaptable and precise assistance in minimally invasive surgeries.

5/31/2024