Enhancing Surgical Training with Self-Supervised Models
The field of robotic surgery is undergoing a transformation with the integration of advanced self-supervised learning models. This project focused on determining the skill levels of surgeons operating the da Vinci Surgical System (dVSS) by leveraging data produced during surgeries. By analyzing video and kinematic data, the project aimed to enhance surgical training and execution without relying on manual data labeling.
Determining Surgeon Skill with Robotic Surgery Data
This initiative utilized self-supervised models to predict surgeon skill levels based on their interactions with surgical robots. By employing advanced machine learning architectures and leveraging datasets such as JIGSAWS (the JHU-ISI Gesture and Skill Assessment Working Set), the project demonstrated how robotic surgery data could be harnessed to improve surgical outcomes and training methodologies.
Key Highlights:
Several critical aspects defined the success of this project:
- Objective: The primary goal was to predict surgeon skill levels—categorized as novice, intermediate, and expert—using self-supervised models trained on robotic surgery data. This was achieved by leveraging video and kinematic data without the need for manual labeling.
- Data Utilization: The JIGSAWS dataset, containing synchronized video and kinematic data, served as the foundation. Optical flow techniques, such as OpenCV’s Lucas-Kanade method, were applied to highlight dynamic elements in surgical videos, capturing the nuances of surgeon movements (see the optical-flow sketch after this list).
- Model Development: Two main architectures were implemented, both sketched after this list:
  - A vision transformer (ViT) that encodes optical flow, paired with a fully connected network (FCN) that decodes the embedding into kinematic vectors.
  - LSTM encoders for both video and kinematic data, trained with a binary classification objective to determine whether two data frames are the same.
- Classification: The encoded embeddings were classified with an XGBoost classifier, achieving up to 87% accuracy for skill prediction.
- Visualization: The high-dimensional embeddings were projected with UMAP, clearly separating gestures performed by novice, intermediate, and expert surgeons (a combined classification-and-visualization sketch follows the list).
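
The optical-flow preprocessing step can be pictured with a short OpenCV sketch. This is a minimal example rather than the project's actual pipeline: the video path, corner-detection parameters, and tracking window size are illustrative assumptions.

```python
import cv2

def lucas_kanade_flow(video_path, max_corners=200):
    """Track sparse feature points across a surgical video clip.

    Returns a list of (points_prev, points_next) arrays, one per frame pair.
    Parameter values here are illustrative defaults, not tuned settings.
    """
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError(f"Could not read {video_path}")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    flows = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Pick corners to track (e.g. instrument tips, tissue texture).
        p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                     qualityLevel=0.01, minDistance=7)
        if p0 is not None:
            # Pyramidal Lucas-Kanade: estimate where each tracked point moved.
            p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                                     winSize=(21, 21), maxLevel=3)
            good = status.ravel() == 1
            flows.append((p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)))
        prev_gray = gray

    cap.release()
    return flows
```

A dense-flow variant (e.g. Farnebäck) could substitute for the sparse tracker if whole-frame motion maps are preferred as model input.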
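
The first architecture can be sketched schematically in PyTorch, assuming the torchvision ViT-B/16 backbone and a 76-dimensional kinematic target (the JIGSAWS kinematic frame size). Layer widths and the decoder shape are illustrative assumptions, not the project's reported configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

class FlowToKinematics(nn.Module):
    """Encode an optical-flow image with a ViT, decode it to a kinematic vector.

    kin_dim=76 matches the JIGSAWS kinematic frame; the exact target used in
    the project is an assumption.
    """
    def __init__(self, kin_dim=76):
        super().__init__()
        self.encoder = vit_b_16(weights=None)   # ViT-B/16 backbone, trained from scratch here
        self.encoder.heads = nn.Identity()      # keep the 768-d class-token embedding
        self.decoder = nn.Sequential(           # fully connected decoder
            nn.Linear(768, 256), nn.ReLU(),
            nn.Linear(256, kin_dim),
        )

    def forward(self, flow_img):                # flow_img: (B, 3, 224, 224) rendered flow
        z = self.encoder(flow_img)              # (B, 768) embedding reused downstream
        return self.decoder(z)                  # (B, kin_dim) predicted kinematics

# Self-supervised objective: regress kinematics from flow, no skill labels needed.
model = FlowToKinematics()
flow_batch = torch.randn(4, 3, 224, 224)        # stand-in for rendered flow images
kin_batch = torch.randn(4, 76)                  # synchronized kinematic targets
loss = nn.functional.mse_loss(model(flow_batch), kin_batch)
```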
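
The second architecture pairs two LSTM encoders with a binary "same frame or not" head. The sketch below assumes pre-extracted 512-dimensional video features and 76-dimensional kinematic frames over short windows; all sizes are placeholders rather than the project's actual settings.

```python
import torch
import torch.nn as nn

class PairedLSTMEncoder(nn.Module):
    """Encode a video-feature window and a kinematic window with LSTMs, then
    classify whether the two windows come from the same data frame."""
    def __init__(self, video_dim=512, kin_dim=76, hidden=128):
        super().__init__()
        self.video_lstm = nn.LSTM(video_dim, hidden, batch_first=True)
        self.kin_lstm = nn.LSTM(kin_dim, hidden, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, 64), nn.ReLU(),
            nn.Linear(64, 1),                    # logit: "same frame" vs "different frame"
        )

    def forward(self, video_seq, kin_seq):
        # Use the final hidden state of each LSTM as the window embedding.
        _, (hv, _) = self.video_lstm(video_seq)  # video_seq: (B, T, video_dim)
        _, (hk, _) = self.kin_lstm(kin_seq)      # kin_seq:   (B, T, kin_dim)
        pair = torch.cat([hv[-1], hk[-1]], dim=-1)
        return self.classifier(pair).squeeze(-1)  # (B,) logits

model = PairedLSTMEncoder()
video = torch.randn(8, 30, 512)                  # e.g. 30 frames of pooled visual features
kin = torch.randn(8, 30, 76)                     # matching kinematic window
labels = torch.randint(0, 2, (8,)).float()       # 1 = aligned pair, 0 = mismatched pair
loss = nn.functional.binary_cross_entropy_with_logits(model(video, kin), labels)
```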
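
Downstream of either encoder, the frozen embeddings are classified and visualized. The sketch below uses random placeholder data; in the project, the inputs would be the encoder outputs and the three JIGSAWS skill labels, and the 87% figure comes from the project's evaluation, not from this toy example.

```python
import numpy as np
import umap                                   # from the umap-learn package
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# embeddings: (N, D) vectors from one of the self-supervised encoders;
# labels: 0 = novice, 1 = intermediate, 2 = expert. Both are placeholders here.
embeddings = np.random.rand(300, 128)
labels = np.random.randint(0, 3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, stratify=labels, random_state=0)

# Gradient-boosted trees over the frozen embeddings.
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))

# Project the embeddings to 2-D with UMAP to inspect skill-level clusters.
coords = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="viridis", s=10)
plt.title("UMAP projection of gesture embeddings by skill level")
plt.show()
```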
Insights and Impact:
- Self-supervised models proved capable of assessing surgeon skill without manually labeled data, paving the way for continuous improvements in robotic surgery training.
- Advanced techniques such as optical flow and dimensionality reduction methods highlighted key aspects of surgeon movements, offering deeper insights into their performance.
- High classification accuracy indicated the potential for these models to become integral tools for evaluating and improving surgical proficiency.
- Future possibilities include developing smart assistants to provide real-time feedback to surgeons, enhancing precision and outcomes in robotic surgeries.
This project showcased how self-supervised learning models could redefine skill assessment in robotic surgery, continuously evolving with each surgery performed. By bridging the gap between data analysis and practical applications, this work set a benchmark for future advancements in surgical robotics.