Pupil Segmentation in Cataract Surgery Videos

Our abstract/talk on iris and pupil segmentation in cataract surgery videos has been accepted for presentation at the ISBI 2020 conference.

Title: Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN

Authors: Natalia Sokolova, Mario Taschwer, Klaus Schoeffmann

Abstract: Cataract surgery replaces the eye lens with an artificial one and is one of the most common surgical procedures performed worldwide. These surgeries can be recorded using a microscope camera and resulting videos stored for educational or documentary purposes as well as for automated post-operative analysis for detecting adverse events or complications. As pupil reactions (dilation or constriction) may lead to complications during surgery, automatic localization of pupil and iris in cataract surgery videos is a necessary preprocessing step for automated analysis. The problems of recognition, localization and tracking of eyes in medical images or videos have already been studied in the literature. However, none of these approaches used pixel-based segmentation, which would allow to localize pupil and iris in a sufficiently accurate way for further automated analysis. In this work, we investigate pixel-based pupil and iris segmentation by a region-based convolutional neural network (Mask R-CNN), which has not been applied to this problem before, to the best of our knowledge. We evaluate the performance of Mask R-CNN with different backbone networks for a manually annotated image dataset. Our method achieves at least 80% of Intersection over Union (IoU) for each iris example and at least 85% IoU for each pupil example in the test dataset.

Deblurring Cataract Surgery Videos

Our recent work on deblurring surgery videos has been accepted for publication at the ISBI 2020 conference.

Title: Deblurring Cataract Surgery Videos Using a Multi-Scale Deconvolutional Neural Network

Authors: Negin Ghamsarian, Klaus Schoeffmann, Mario Taschwer

Abstract: A common quality impairment observed in surgery videos is blur, caused by object motion or a defocused camera. Degraded image quality hampers the progress of machine-learning-based approaches in learning and recognizing semantic information in surgical video frames like instruments, phases, and surgical actions. This problem can be mitigated by automatically deblurring video frames as a preprocessing method for any subsequent video analysis task. In this paper, we propose and evaluate a multi-scale deconvolutional neural network to deblur cataract surgery videos. Experimental results confirm the effectiveness of the proposed approach in terms of the visual quality of frames as well as PSNR improvement.

MMM’20: Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos

Our paper has been accepted for publication at the MMM 2020 Conference on Multimedia Modeling. Authors: Natalia Sokolova, Klaus Schoeffmann, Mario Taschwer, Doris Putzgruber-Adamitsch, Yosuf El-Shabrawi Abstract: In the field of ophthalmic surgery, many clinicians nowadays record their microscopic procedures with a video camera and use the recorded footage for later purpose, such as forensics, teaching, or training. However, in order to efficiently use the video material after surgery, the video content needs to be analyzed automatically. Important semantic content to be analyzed and indexed in these short videos are operation instruments, since they provide an indication of the corresponding operation phase and surgical action. Related work has already shown that it is possible to accurately detect instruments in cataract surgery videos. However, their underlying dataset (from the CATARACTS challenge) has very good visual quality, which is not reflecting the typical quality of videos acquired in general hospitals. In this paper, we therefore analyze the generalization performance of deep learning models for instrument recognition in terms of dataset change. More precisely, we trained such models as ResNet-50, Inception v3 and NASNet Mobile using a dataset of high visual quality (CATARACT) and test it on another dataset with low visual quality (Cataract-101), and vice versa. Our results show that the generalizability is rather low in general, but clearly worse for the model trained on the high-quality dataset. Another important observation is the fact that the trained models are able to detect similar instruments in the other dataset even if their appearance is different.

URL: https://link.springer.com/chapter/10.1007/978-3-030-37734-2_51 


   author    = {Sokolova, Natalia and Schoeffmann, Klaus and Taschwer, Mario and Putzgruber-Adamitsch, Doris and El-Shabrawi, Yosuf},
   title     = {Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos},
   booktitle = {MultiMedia Modeling},
   year      = {2020},
   editor    = {Cheng, Wen-Huang and Kim, Junmo and Chu, Wei-Ta and Cui, Peng and Choi, Jung-Woo and Hu, Min-Chun and De Neve, Wesley},
   pages     = {626--636},
   address   = {Cham},
   publisher = {Springer International Publishing},
   doi       = {10.1007/978-3-030-37734-2_51},
   isbn      = {978-3-030-37734-2}

Second PhD student joined the project

In February 2019,  a second PhD student (Negin Ghamsarian) joined the OVID project. She will work on surgical workflow analysis and video preprocessing.

Since a third PhD student withdrew her application in a late stage of the hiring process, we decided to conduct the project with only two PhD students for now.

OVID project started on October 1, 2018

The OVID project officially started on October 1, 2018. The research project is supported by the Austrian Science Fund FWF and is scheduled to run for three years.

The FWF grant supports employment of three PhD students (for 3 years each) and one student assistant (for 15 months). Currently, one PhD student (Natalia Sokolova) and the student assistant (Markus Fox) have taken up their work. The other two PhD students will follow within the next few months.