Thoracic Surgery Video Analysis for Surgical Phase Recognition

Read original: arXiv:2406.09185 - Published 6/14/2024 by Syed Abdul Mateen, Niharika Malvia, Syed Abdul Khader, Danny Wang, Deepti Srinivasan, Chi-Fu Jeffrey Yang, Lana Schumacher, Sandeep Manjanna

Overview

This research paper explores the use of video analysis techniques to recognize different surgical phases during thoracic surgery procedures.
The goal is to develop an automated system that can accurately identify the current phase of a surgical procedure in real time, which could assist surgeons and improve surgical workflow.
The paper compares frame-based and clip-based classifiers on a thoracic surgery dataset annotated with 11 phase classes and reports the top-1 accuracy of each approach.

Plain English Explanation

During complex medical procedures like thoracic surgery, it's important for surgeons to have a clear understanding of the current stage or "phase" of the operation. This information can help them make better decisions, coordinate with their team more effectively, and ensure the surgery progresses smoothly.

The researchers in this study wanted to create a computer system that could automatically recognize the different phases of a thoracic surgery as it's happening. They did this by analyzing video recordings of past surgeries using advanced machine learning techniques.

The system is meant to watch video of a surgery and identify which phase the procedure is currently in. The paper's dataset distinguishes 11 such phases; as a loose illustration, think of stages such as the initial incision, the main operative steps, and the closing of the incision. Phase recognition of this kind could provide valuable information to the surgical team and potentially improve the overall quality and efficiency of the operation.

While this technology is still in the research stage, the goal is for it to eventually be integrated into the operating room to assist surgeons and enhance patient care. By automating the process of tracking the surgical workflow, this approach could help streamline procedures, reduce errors, and free up the surgical team to focus more on the critical aspects of the operation.

Technical Explanation

The researchers developed a methodology for recognizing surgical phases from thoracic surgery video. They first collected a dataset of thoracic surgery recordings annotated with 11 phase classes, which they used to train and evaluate their phase recognition models.

The core of their approach involves using computer vision and video analysis techniques to identify visual cues and patterns associated with each surgical phase. This includes tracking the movements and interactions of surgical tools, as well as monitoring changes in the visual appearance of the surgical site over time.
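The paper's abstract distinguishes frame-based from clip-based phase recognition, and one way to picture the difference is in how samples are drawn from an annotated video. The sketch below is a minimal illustration, not the authors' pipeline: the function names, the phase names, the clip length of 16 frames, the stride of 8, and the rule of labeling a clip by its most frequent phase are all assumptions made for the example.

```python
from collections import Counter
from typing import List, Tuple


def frame_samples(labels: List[str]) -> List[Tuple[int, str]]:
    """Frame-based view: every frame is an independent (index, phase) sample."""
    return list(enumerate(labels))


def clip_samples(
    labels: List[str], clip_len: int = 16, stride: int = 8
) -> List[Tuple[range, str]]:
    """Clip-based view: sliding windows of consecutive frame indices.

    Each clip is labeled with its most frequent per-frame phase
    (an assumed rule; the paper may label clips differently).
    """
    clips = []
    for start in range(0, len(labels) - clip_len + 1, stride):
        window = range(start, start + clip_len)
        majority = Counter(labels[i] for i in window).most_common(1)[0][0]
        clips.append((window, majority))
    return clips


# Hypothetical per-frame annotations for a short 40-frame video.
video_labels = ["port_placement"] * 20 + ["dissection"] * 20

frames = frame_samples(video_labels)                        # 40 single-frame samples
clips = clip_samples(video_labels, clip_len=16, stride=8)   # 4 overlapping clips
```

A frame-based classifier sees each entry of `frames` in isolation, while a clip-based classifier receives the frames of one window together and can therefore use motion and temporal context.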

The researchers compared two families of models. For frame-based classification they used a Vision Transformer (ViT) pretrained on ImageNet, and for clip-based classification they used VideoMAE as the baseline. According to the abstract, Masked Video Distillation (MVD) performed best, reaching a top-1 accuracy of 72.9% compared with 52.31% for the ImageNet ViT, which indicates that video-based classifiers outperform image-based ones on this task.
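The abstract reports its results as top-1 accuracy, meaning the share of samples for which the model's single highest-scoring phase matches the annotated phase. A minimal sketch of that metric follows; the function name, the phase names, and the scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def top1_accuracy(scores: List[Dict[str, float]], targets: List[str]) -> float:
    """Fraction of samples whose highest-scoring phase equals the annotated phase."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        raise ValueError("need at least one sample")
    correct = sum(
        1 for s, t in zip(scores, targets) if max(s, key=s.get) == t
    )
    return correct / len(targets)


# Hypothetical model outputs for four samples over three made-up phases.
scores = [
    {"port_placement": 0.7, "dissection": 0.2, "closure": 0.1},
    {"port_placement": 0.1, "dissection": 0.6, "closure": 0.3},
    {"port_placement": 0.5, "dissection": 0.4, "closure": 0.1},
    {"port_placement": 0.2, "dissection": 0.3, "closure": 0.5},
]
targets = ["port_placement", "dissection", "dissection", "closure"]

accuracy = top1_accuracy(scores, targets)  # third sample is misclassified
```

The same function applies whether each sample is a single frame or a clip, which is what makes the two approaches directly comparable.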

Critical Analysis

The researchers acknowledge several limitations and areas for further exploration in their work. One key challenge is the variability in surgical techniques and workflows, which can make it difficult to generalize the phase recognition system to a wide range of surgical procedures.

Additionally, the researchers note that the performance of the system may be influenced by factors such as video quality, camera placement, and environmental conditions in the operating room. Addressing these sources of noise and uncertainty could be an important area for future research.

While the results presented in the paper are promising, the researchers also emphasize the need for further validation and testing in real-world clinical settings. Integrating the phase recognition system into the surgical workflow and evaluating its impact on surgical outcomes and efficiency will be crucial next steps.

Overall, this research represents an important step towards the development of intelligent surgical assistance systems. By automating the tracking and analysis of surgical procedures, such technologies have the potential to enhance patient safety, improve surgical decision-making, and ultimately lead to better patient outcomes.

Conclusion

This research paper outlines an approach to surgical phase recognition from thoracic surgery video. The proposed system uses computer vision and video analysis techniques to automatically identify the current phase of a thoracic surgery procedure in real time.

The researchers have demonstrated the effectiveness of their phase recognition system through extensive testing and evaluation, highlighting its potential to assist surgeons and improve surgical workflows. While there are still some limitations and areas for further research, this work represents an important step towards the development of intelligent surgical assistance technologies.

As these systems continue to evolve and be integrated into real-world clinical settings, they could lead to enhanced patient safety, more efficient surgical procedures, and ultimately, better healthcare outcomes for patients undergoing complex medical interventions.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Thoracic Surgery Video Analysis for Surgical Phase Recognition

Syed Abdul Mateen, Niharika Malvia, Syed Abdul Khader, Danny Wang, Deepti Srinivasan, Chi-Fu Jeffrey Yang, Lana Schumacher, Sandeep Manjanna

This paper presents an approach for surgical phase recognition using video data, aiming to provide a comprehensive understanding of surgical procedures for automated workflow analysis. The advent of robotic surgery, digitized operating rooms, and the generation of vast amounts of data have opened doors for the application of machine learning and computer vision in the analysis of surgical videos. Among these advancements, Surgical Phase Recognition (SPR) stands out as an emerging technology that has the potential to recognize and assess the ongoing surgical scenario, summarize the surgery, evaluate surgical skills, offer surgical decision support, and facilitate medical training. In this paper, we analyse and evaluate both frame-based and video clipping-based phase recognition on a thoracic surgery dataset consisting of 11 classes of phases. Specifically, we utilize ImageNet ViT for image-based classification and VideoMAE as the baseline model for video-based classification. We show that Masked Video Distillation (MVD) exhibits superior performance, achieving a top-1 accuracy of 72.9%, compared to 52.31% achieved by ImageNet ViT. These findings underscore the efficacy of video-based classifiers over their image-based counterparts in surgical phase recognition tasks.

6/14/2024

Robust Surgical Phase Recognition From Annotation Efficient Supervision

Or Rubin, Shlomi Laufer

Surgical phase recognition is a key task in computer-assisted surgery, aiming to automatically identify and categorize the different phases within a surgical procedure. Despite substantial advancements, most current approaches rely on fully supervised training, requiring expensive and time-consuming frame-level annotations. Timestamp supervision has recently emerged as a promising alternative, significantly reducing annotation costs while maintaining competitive performance. However, models trained on timestamp annotations can be negatively impacted by missing phase annotations, leading to a potential drawback in real-world scenarios. In this work, we address this issue by proposing a robust method for surgical phase recognition that can handle missing phase annotations effectively. Furthermore, we introduce the SkipTag@K annotation approach to the surgical domain, enabling a flexible balance between annotation effort and model performance. Our method achieves competitive results on two challenging datasets, demonstrating its efficacy in handling missing phase annotations and its potential for reducing annotation costs. Specifically, we achieve an accuracy of 85.1% on the MultiBypass140 dataset using only 3 annotated frames per video, showcasing the effectiveness of our method and the potential of the SkipTag@K setup. We perform extensive experiments to validate the robustness of our method and provide valuable insights to guide future research in surgical phase recognition. Our work contributes to the advancement of surgical workflow recognition and paves the way for more efficient and reliable surgical phase recognition systems.

6/27/2024

EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos

Ryo Fujii, Masashi Hatano, Hideo Saito, Hiroki Kajita

Surgical phase recognition has gained significant attention due to its potential to offer solutions to numerous demands of the modern operating room. However, most existing methods concentrate on minimally invasive surgery (MIS), leaving surgical phase recognition for open surgery understudied. This discrepancy is primarily attributed to the scarcity of publicly available open surgery video datasets for surgical phase recognition. To address this issue, we introduce a new egocentric open surgery video dataset for phase recognition, named EgoSurgery-Phase. This dataset comprises 15 hours of real open surgery videos spanning 9 distinct surgical phases, all captured using an egocentric camera attached to the surgeon's head. In addition to video, EgoSurgery-Phase offers eye gaze. As far as we know, it is the first real open surgery video dataset for surgical phase recognition publicly available. Furthermore, inspired by the notable success of masked autoencoders (MAEs) in video understanding tasks (e.g., action recognition), we propose a gaze-guided masked autoencoder (GGMAE). Considering that the regions where surgeons' gaze focuses are often critical for surgical phase recognition (e.g., the surgical field), in our GGMAE the gaze information acts as an empirical semantic richness prior that guides the masking process, promoting better attention to semantically rich spatial regions. GGMAE significantly improves on the previous state-of-the-art recognition method (6.4% in Jaccard) and the masked autoencoder-based method (3.1% in Jaccard) on EgoSurgery-Phase. The dataset will be released at https://github.com/Fujiry0/EgoSurgery.

5/31/2024

MuST: Multi-Scale Transformers for Surgical Phase Recognition

Alejandra Pérez, Santiago Rodríguez, Nicolás Ayobi, Nicolás Aparicio, Eugénie Dessevres, Pablo Arbeláez

Phase recognition in surgical videos is crucial for enhancing computer-aided surgical systems as it enables automated understanding of sequential procedural stages. Existing methods often rely on fixed temporal windows for video analysis to identify dynamic surgical phases. Thus, they struggle to simultaneously capture short-, mid-, and long-term information necessary to fully understand complex surgical procedures. To address these issues, we propose Multi-Scale Transformers for Surgical Phase Recognition (MuST), a novel Transformer-based approach that combines a Multi-Term Frame encoder with a Temporal Consistency Module to capture information across multiple temporal scales of a surgical video. Our Multi-Term Frame Encoder computes interdependencies across a hierarchy of temporal scales by sampling sequences at increasing strides around the frame of interest. Furthermore, we employ a long-term Transformer encoder over the frame embeddings to further enhance long-term reasoning. MuST achieves higher performance than previous state-of-the-art methods on three different public benchmarks.

7/25/2024