Real-Time Segmentation of Non-rigid Surgical Tools Based on Deep Learning and Tracking

  • Conference paper
  • In: Computer-Assisted and Robotic Endoscopy (CARE 2016)

Abstract

Real-time tool segmentation is an essential component in computer-assisted surgical systems. We propose a novel real-time automatic method based on Fully Convolutional Networks (FCN) and optical flow tracking. Our method exploits the ability of deep neural networks to produce accurate segmentations of highly deformable parts along with the high speed of optical flow. Furthermore, the pre-trained FCN can be fine-tuned on a small number of medical images without the need to hand-craft features. We validated our method using existing and new benchmark datasets, covering both ex vivo and in vivo real clinical cases where different surgical instruments are employed. Two versions of the method are presented: non-real-time and real-time. The former, using only deep learning, achieves a balanced accuracy of 89.6% on a real clinical dataset, outperforming the (non-real-time) state of the art by 3.8 percentage points. The latter, a combination of deep learning with optical flow tracking, yields an average balanced accuracy of 78.2% across all validated datasets.
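
To make the hybrid design described above more concrete, the following Python sketch shows one way to interleave a slow but accurate FCN pass on keyframes with fast Lucas-Kanade optical flow propagation in between, together with the standard balanced-accuracy definition used for evaluation. The function names, the keyframe interval and the mask-propagation step are illustrative assumptions, not the authors' implementation; only the overall FCN-plus-optical-flow idea comes from the paper.

    # Minimal sketch of the FCN + optical flow idea, assuming OpenCV and NumPy.
    # `fcn_segment` is a hypothetical placeholder for the fine-tuned FCN.
    import cv2
    import numpy as np

    def fcn_segment(frame_bgr):
        """Placeholder for the FCN forward pass; returns a binary tool mask."""
        raise NotImplementedError  # e.g. an FCN-8s fine-tuned on surgical frames

    def balanced_accuracy(pred, gt):
        """Balanced accuracy = (sensitivity + specificity) / 2 for binary masks."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        tp = np.logical_and(pred, gt).sum()
        tn = np.logical_and(~pred, ~gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        sens = tp / max(tp + fn, 1)
        spec = tn / max(tn + fp, 1)
        return 0.5 * (sens + spec)

    def segment_video(frames, keyframe_interval=10):
        """Run the FCN on keyframes, propagate the mask with LK optical flow."""
        prev_gray, mask, points = None, None, None
        for i, frame in enumerate(frames):
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if i % keyframe_interval == 0 or points is None or len(points) == 0:
                # Slow, accurate path: FCN segmentation + feature re-seeding.
                mask = fcn_segment(frame).astype(np.uint8)
                points = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                                 qualityLevel=0.01,
                                                 minDistance=5, mask=mask)
            else:
                # Fast path: track tool features with pyramidal Lucas-Kanade.
                new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                              points, None)
                good_new = new_pts[status.flatten() == 1]
                good_old = points[status.flatten() == 1]
                if len(good_new) > 0:
                    # Shift the previous mask by the median feature motion
                    # (a crude stand-in for the paper's propagation step).
                    shift = np.median(good_new - good_old, axis=0).flatten()
                    M = np.float32([[1, 0, shift[0]], [0, 1, shift[1]]])
                    mask = cv2.warpAffine(mask, M, (mask.shape[1], mask.shape[0]))
                points = good_new.reshape(-1, 1, 2)
            prev_gray = gray
            yield mask

In this sketch the keyframe interval trades accuracy for speed: more frequent FCN passes track deformation better, while longer optical-flow stretches keep the pipeline real-time.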

Acknowledgements

This work was supported by the Wellcome Trust [WT101957], EPSRC (NS/A000027/1, EP/H046410/1, EP/J020990/1, EP/K005278), the NIHR BRC UCLH/UCL High Impact Initiative and a UCL EPSRC CDT Scholarship Award (EP/L016478/1). The authors would like to thank NVIDIA for the donated GeForce GTX TITAN X GPU, their colleagues E. Maneas, S. Moriconi, F. Chadebecq, M. Ebner and S. Nousias for the FetalFlexTool ground truth, and E. Maneas for preparing the setup with an ex vivo placenta.

Author information

Corresponding author

Correspondence to Luis C. García-Peraza-Herrera.

Electronic supplementary material

Below is the link to the electronic supplementary material.

440896_1_En_8_MOESM1_ESM.mp4

Supplementary material 1 (mp4 1369 KB)

440896_1_En_8_MOESM2_ESM.mp4

Supplementary material 2 (mp4 1221 KB)

440896_1_En_8_MOESM3_ESM.mp4

Supplementary material 3 (mp4 1118 KB)

Supplementary material 4 (mp4 3555 KB)

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

García-Peraza-Herrera, L.C. et al. (2017). Real-Time Segmentation of Non-rigid Surgical Tools Based on Deep Learning and Tracking. In: Peters, T., et al. Computer-Assisted and Robotic Endoscopy. CARE 2016. Lecture Notes in Computer Science, vol 10170. Springer, Cham. https://doi.org/10.1007/978-3-319-54057-3_8

  • DOI: https://doi.org/10.1007/978-3-319-54057-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54056-6

  • Online ISBN: 978-3-319-54057-3

  • eBook Packages: Computer Science, Computer Science (R0)
