Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localisation
This dataset contains the training and test data of the ICPR 2020 paper cited below. Specifically, it contains video segments from all cataract surgery phases, which were used to train the 1-vs-all models for the four relevant surgery phases listed below. The models were evaluated with five different methods: (a) CNN, (b) CNN+LSTM, (c) CNN+GRU, (d) CNN+BiLSTM, and (e) CNN+BiGRU.
- Rhexis phase
- Phacoemulsification phase
- Irrigation/aspiration (with viscoelastic suction)
- Lens implantation
Negin Ghamsarian, Mario Taschwer, Doris Putzgruber, Stephanie Sarny, and Klaus Schoeffmann. 2020. Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization. Proceedings of the 25th International Conference on Pattern Recognition (ICPR 2020). IEEE, Los Alamitos, CA, USA, 8 pages (to appear).
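The 1-vs-all setup described above can be illustrated with a minimal sketch: for each relevant phase, segments of that phase are treated as positives and segments of every other phase as negatives. The function name, the phase strings, and the list-of-labels layout are assumptions made for illustration only; they are not the file format of this dataset or the authors' code.

```python
def one_vs_all_labels(segment_phases, target_phase):
    """Return 1 for segments of target_phase and 0 for all other segments."""
    return [1 if phase == target_phase else 0 for phase in segment_phases]


# Hypothetical phase labels of four video segments (illustrative only).
segment_phases = [
    "Incision",
    "Phacoemulsification",
    "Lens implantation",
    "Phacoemulsification",
]

# One binary label vector per relevant phase, i.e. one model per phase.
phaco_labels = one_vs_all_labels(segment_phases, "Phacoemulsification")
lens_labels = one_vs_all_labels(segment_phases, "Lens implantation")
```

Each binary label vector would then be paired with the same set of segments to train a separate detector for that phase.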
Relevance-Based Compression of Cataract Videos
This dataset contains annotations of action and idle content in frames from cataract videos. More specifically, it contains annotations of temporal segments in which no instruments are used (idle phases) as well as annotations of action content (spatial annotations of the eye and instruments in frames).
For the first set of annotations, 22 videos were selected from the released Cataract-101 dataset, and every frame of these videos was annotated as either an idle or an action frame. Of these videos, 18 were randomly selected for training and the remaining 4 were used for testing. Subsequently, 500 idle and 500 action frames were uniformly sampled from each video, yielding 9000 frames per class in the training set and 2000 frames per class in the test set.
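The split and sampling arithmetic above can be reproduced with a short sketch. It assumes that "uniformly sampled" means evenly spaced positions within the frames of one class (it could equally mean uniform random sampling); the function names, the seed, and the integer video IDs are hypothetical and not taken from the paper.

```python
import random


def split_videos(video_ids, n_train=18, seed=0):
    """Randomly pick n_train videos for training; the rest form the test set."""
    rng = random.Random(seed)
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


def uniform_sample(frame_indices, k=500):
    """Pick k frames at evenly spaced positions from a list of frame indices."""
    n = len(frame_indices)
    if n <= k:
        return list(frame_indices)
    step = n / k  # step > 1, so the selected positions are distinct
    return [frame_indices[int(i * step)] for i in range(k)]


videos = list(range(1, 23))  # 22 placeholder video IDs
train, test = split_videos(videos)

# 500 frames per class and video: 18 * 500 = 9000 and 4 * 500 = 2000.
frames_per_class_train = len(train) * 500
frames_per_class_test = len(test) * 500
```

Splitting by video rather than by frame keeps all frames of one surgery on the same side of the split.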
The second set of annotations consists of manual annotations of the cornea and instruments, created with the open-source Supervisely platform. We annotated the cornea in 262 frames from 11 cataract surgery videos for the eye segmentation task, and the instruments in 216 frames from the same videos for the instrument segmentation task.
The dataset has been released as an add-on to our ACM Multimedia 2020 paper:
N. Ghamsarian, H. Amirpour, C. Timmerer, M. Taschwer, K. Schoeffmann. 2020. Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. In Proceedings of the ACM International Conference on Multimedia (ACMMM2020), pages 1-9. ACM, 2020 (to appear).
Tool Segmentation in Cataract Surgery Videos
This dataset contains bounding-box and mask segmentations of typical instruments in cataract surgery, covering 393 frames selected from the Cataract-101 video dataset as well as 4738 images from the CaDIS dataset. An evaluation on this dataset can be found in the following CBMS 2020 paper:
Markus Fox, Mario Taschwer, Klaus Schoeffmann. 2020. Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN. Proceedings of the 33rd International Symposium on Computer Based Medical Systems (CBMS), IEEE, Los Alamitos, 4 pages.
Iris and Pupil Segmentation in Cataract Surgery Videos
This dataset contains mask segmentations of the iris and pupil in 82 frames sampled from videos of the Cataract-101 video dataset. An evaluation of automatic iris and pupil segmentation can be found in our ISBI 2020 workshop paper:
Natalia Sokolova, Mario Taschwer, Stephanie Sarny, Doris Putzgruber-Adamitsch, Klaus Schoeffmann. 2020. Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN. Proceedings of the IEEE International Symposium on Biomedical Imaging Workshops. IEEE, Los Alamitos, CA, USA, 4 pages.
Cataract-101 Video Dataset
The ITEC Cataract-101 dataset consists of videos of 101 cataract surgeries performed by four different surgeons over a period of 9 months, annotated with the different operation phases. The surgeons are grouped into moderately experienced and highly experienced surgeons (assistant vs. senior physicians), which provides the basis for experience-based video analytics, as described in detail in the corresponding paper presented at MMSys 2018 (if you use our dataset, please cite this paper).
Cataract-21 Video Dataset
The dataset contains 21 video recordings of cataract surgeries. It is divided into a training part consisting of 17 videos and a validation part consisting of 4 videos. For each video, a CSV file with ground-truth annotations is provided, linking each frame number to one of ten classes (operation phases). The ground-truth annotation was performed by medical experts of Klinikum Klagenfurt. Please note that parts of the video recordings that do not belong to any of the classes are labelled “not_initialized” (in particular, the part before the first phase, “Incision”).
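As an illustration of how such per-frame annotations might be consumed, the sketch below reads frame/label pairs and collapses them into contiguous phase segments. The two-column, comma-separated, header-less layout and the function names are assumptions; check the actual CSV files before relying on them. The sample rows are synthetic and only reuse the labels mentioned above.

```python
import csv
import io
from itertools import groupby


def read_annotations(handle):
    """Read (frame_number, label) pairs from a two-column CSV (assumed layout)."""
    return [(int(row[0]), row[1].strip()) for row in csv.reader(handle) if row]


def to_segments(annotations):
    """Collapse per-frame labels into (label, first_frame, last_frame) runs."""
    segments = []
    for label, run in groupby(annotations, key=lambda pair: pair[1]):
        frames = [frame for frame, _ in run]
        segments.append((label, frames[0], frames[-1]))
    return segments


# Synthetic example rows; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "0,not_initialized\n"
    "1,not_initialized\n"
    "2,Incision\n"
    "3,Incision\n"
)
segments = to_segments(read_annotations(sample))
```

Segments labelled “not_initialized” can then be filtered out before training or evaluation.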