Frame-Based Classification of Operation Phases in Cataract Surgery Videos

Primus, Manfred Jüergen; Putzgruber-Adamitsch, Doris; Taschwer, Mario; Münzer, Bernd; El-Shabrawi, Yosuf; Böszörmenyi, Laszlo; Schoeffmann, Klaus

doi:10.1007/978-3-319-73603-7_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10704)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Cataract surgeries are frequently performed to correct lens opacification of the human eye, which usually develops with age. These surgeries are conducted with the help of a microscope and are typically recorded on video for later inspection and educational purposes. However, post-hoc visual analysis of video recordings is cumbersome and time-consuming for surgeons if there is no navigation support, such as bookmarks to specific operation phases. To prepare the way for automatic detection of operation phases in cataract surgery videos, we investigate the effectiveness of a deep convolutional neural network (CNN) in automatically assigning video frames to operation phases, which can be regarded as a single-label multi-class classification problem. In the absence of public datasets of cataract surgery videos, we provide a dataset of 21 videos of standardized cataract surgeries and use it to train and evaluate our CNN classifier. Experimental results show a mean F1-score of about 68% for frame-based operation phase classification, which can be further improved to 75% when considering temporal information of video frames in the CNN architecture.
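To make the reported metric concrete, the following is a minimal sketch of a mean F1-score for single-label multi-class frame classification. It assumes that "mean F1-score" denotes the unweighted average of per-class F1 values (often called macro-averaging); the abstract does not state the averaging scheme. The phase labels `P1` to `P3` and the frame sequences are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1} for single-label multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def mean_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Hypothetical ground truth and predictions, one label per video frame.
frames_true = ["P1", "P1", "P2", "P2", "P2", "P3"]
frames_pred = ["P1", "P2", "P2", "P2", "P3", "P3"]
score = mean_f1(frames_true, frames_pred, ["P1", "P2", "P3"])
print(f"mean F1 = {score:.3f}")
```

Because every class contributes equally to the unweighted mean, short phases with few frames influence the score as much as long ones, which matters when phase durations differ widely.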

Notes

1. This decision is due to restrictions of the CAFFE framework, which does not easily allow adding inputs to fully connected layers of the CNN.
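The note does not say which design decision it justifies, so the following is only a generic illustration of what adding an input to a fully connected layer usually means: the extra value is concatenated onto the feature vector before the layer's weights are applied. The sketch is plain Python, not Caffe, and every name and number in it is hypothetical; it is not the authors' architecture.

```python
def dense(inputs, weights, bias):
    """Fully connected layer without activation: one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def dense_with_extra_input(features, extra, weights, bias):
    """Concatenate extra inputs onto the features, then apply the layer."""
    augmented = list(features) + list(extra)
    if any(len(row) != len(augmented) for row in weights):
        raise ValueError("each weight row needs one weight per input")
    return dense(augmented, weights, bias)


# Hypothetical values: three image features plus one additional scalar input.
features = [0.5, -1.0, 2.0]
extra = [0.25]
weights = [
    [1.0, 0.0, 0.5, 2.0],   # last column weights the extra input
    [0.0, 1.0, 0.0, -1.0],
]
bias = [0.0, 0.1]
outputs = dense_with_extra_input(features, extra, weights, bias)
print(outputs)
```

The only structural change compared with an ordinary fully connected layer is the wider weight matrix: one additional column per extra input.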

Acknowledgement

This work was supported by Universität Klagenfurt and Lakeside Labs GmbH, Klagenfurt, Austria and funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF-20214 U. 3520/26336/38165.

Author information

Authors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Manfred Jüergen Primus, Mario Taschwer, Bernd Münzer, Laszlo Böszörmenyi & Klaus Schoeffmann
Klinikum Klagenfurt am Wörthersee, Klagenfurt, Austria
Doris Putzgruber-Adamitsch & Yosuf El-Shabrawi

Corresponding author

Correspondence to Manfred Jüergen Primus.

Editor information

Editors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Klaus Schoeffmann
Chulalongkorn University, Bangkok, Thailand
Thanarat H. Chalidabhongse
City University of Hong Kong, Hong Kong, China
Chong Wah Ngo
Chulalongkorn University, Bangkok, Thailand
Supavadee Aramvith
Dublin City University, Dublin, Ireland
Noel E. O’Connor
Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj
Rutgers University, Piscataway, New Jersey, USA
Ahmed Elgammal

About this paper

Cite this paper

Primus, M.J. et al. (2018). Frame-Based Classification of Operation Phases in Cataract Surgery Videos. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_20

DOI: https://doi.org/10.1007/978-3-319-73603-7_20
Published: 13 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73602-0
Online ISBN: 978-3-319-73603-7
eBook Packages: Computer Science, Computer Science (R0)
