Published in: International Journal of Computer Assisted Radiology and Surgery 12/2022

14 June 2022 | Original Article

HMD-EgoPose: head-mounted display-based egocentric marker-less tool and hand pose estimation for augmented surgical guidance

Authors: Mitchell Doughty, Nilesh R. Ghugre

Abstract

Purpose

The success or failure of modern computer-assisted surgery procedures hinges on precise six-degree-of-freedom (6DoF) position and orientation (pose) estimation of tracked instruments and tissue. In this paper, we present HMD-EgoPose, a single-shot learning-based approach to hand and object pose estimation, and demonstrate state-of-the-art performance on a benchmark dataset for monocular red-green-blue (RGB) 6DoF marker-less hand and surgical instrument pose tracking. We further show that the HMD-EgoPose framework can deliver performant 6DoF pose estimation on a commercially available optical see-through head-mounted display (OST-HMD) through a low-latency streaming approach.

Methods

Our framework used an efficient convolutional neural network (CNN) backbone for multi-scale feature extraction and a set of subnetworks to jointly learn the 6DoF pose representation of the rigid surgical drill instrument and the grasping orientation of the user's hand. To make our approach usable on a commercially available OST-HMD, the Microsoft HoloLens 2, we created a pipeline for low-latency video and data communication with a high-performance computing workstation capable of optimized network inference.
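As an illustration of how the latency of such a streaming pipeline might be quantified, the following minimal sketch times repeated frame round trips. It is not the authors' implementation: `fake_remote_inference` is a hypothetical stand-in that merely sleeps in place of network transfer and model inference, and `measure_round_trip_ms` is an assumed helper name.

```python
import time


def measure_round_trip_ms(process_frame, frame, repeats=5):
    """Return the mean wall-clock time (ms) of calling process_frame(frame)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        process_frame(frame)  # send frame, wait for the pose result
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples)


def fake_remote_inference(frame):
    """Hypothetical stand-in for streaming a frame and receiving a pose."""
    time.sleep(0.002)  # pretend transfer + inference takes about 2 ms
    return {"translation_mm": (0.0, 0.0, 400.0)}


dummy_frame = bytes(16)  # placeholder for an encoded video frame
mean_ms = measure_round_trip_ms(fake_remote_inference, dummy_frame)
print(f"mean round-trip latency: {mean_ms:.1f} ms")
```

In a real deployment the timestamps would be taken on the headset, so that the measurement includes video encoding, network transfer in both directions, inference, and rendering of the result.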

Results

HMD-EgoPose outperformed current state-of-the-art approaches on a benchmark dataset for surgical tool pose estimation, achieving an average tool 3D vertex error of 11.0 mm on real data and advancing progress towards a clinically viable marker-free tracking strategy. Through our low-latency streaming approach, we achieved a round-trip latency of 199.1 ms for pose estimation and augmented visualization of the tracked model when integrated with the OST-HMD.
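To make the reported metric concrete, the sketch below shows one common way an average 3D vertex error can be computed: the tool model's vertices are transformed by both the predicted and the ground-truth pose, and the per-vertex Euclidean distances are averaged. This is an assumed formulation for illustration; the exact evaluation protocol of the benchmark is not given here, and the function names and toy values are hypothetical.

```python
import math


def apply_pose(vertex, rotation, translation):
    """Apply a rigid pose to one 3D vertex: R @ v + t."""
    return tuple(
        sum(rotation[i][j] * vertex[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def mean_vertex_error(vertices, pose_pred, pose_true):
    """Mean Euclidean distance between vertices under two (R, t) poses."""
    total = 0.0
    for vertex in vertices:
        predicted = apply_pose(vertex, *pose_pred)
        expected = apply_pose(vertex, *pose_true)
        total += math.dist(predicted, expected)
    return total / len(vertices)


# Toy model (units: mm) and poses; values are illustrative only.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
vertices_mm = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 50.0, 0.0)]
pose_true = (identity, (0.0, 0.0, 400.0))
pose_pred = (identity, (3.0, 4.0, 400.0))  # translation off by (3, 4, 0) mm

error_mm = mean_vertex_error(vertices_mm, pose_pred, pose_true)
print(error_mm)  # 5.0, since every vertex is shifted by a 3-4-5 offset
```

Because the error is averaged over model vertices, it reflects both translation and rotation mistakes: a rotation error displaces vertices far from the rotation centre more than those near it.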

Conclusion

Our single-shot learned approach, which optimized 6DoF pose based on the joint interaction between the hand of a user and a rigid surgical drill, was robust to occlusion and complex surfaces and improved on current state-of-the-art approaches to marker-less tool and hand pose estimation. Further, we demonstrated the feasibility of our approach for 6DoF object tracking on a commercially available OST-HMD.
Metadata
Title
HMD-EgoPose: head-mounted display-based egocentric marker-less tool and hand pose estimation for augmented surgical guidance
Authors
Mitchell Doughty
Nilesh R. Ghugre
Publication date
14 June 2022
Publisher
Springer International Publishing
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 12/2022
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-022-02688-y
