Presentation + Paper
24 May 2018
Conception of a touchless human machine interaction system for operating rooms using deep learning
Florian Pereme, Jesus Zegarra Flores, Mirko Scavazzin, Franck Valentini, Jean-Pierre Radoux
Abstract
Touchless human-machine interaction (HMI) is important in sterile environments, especially operating rooms (OR), where surgeons need to interact with images from scanners, X-rays, ultrasound, etc. Contamination can occur if surgeons must touch a keyboard or mouse. To reduce contamination risk and give surgeons more autonomy during an operation, the Medic@ team has developed several projects since 2011. To recognize the hand and its gestures, two main projects, Gesture Tool Box and K2A, both based on the Kinect device (with a depth camera), were prototyped. Hand gestures were detected by segmentation and hand descriptors on RGB images, but always with a dependency on the depth camera (Kinect) for hand detection. Moreover, this approach does not let the system adapt to a new gesture requested by the end user: for each new gesture, a new algorithm must be programmed and tested. Thanks to the evolution of NVIDIA cards, which reduces the processing time of CNN algorithms, the most recent approach explored was deep learning. The Gesture Tool Box project analyzed hand gesture detection using a CNN (pre-trained VGG16) and transfer learning. The results were very promising, with 85% accuracy in detecting 10 different gestures from LSF (French Sign Language), and it was also possible to create a user interface that gives end users the autonomy to add their own gestures and run the transfer learning automatically. However, problems remained with the real-time recognition delay (0.8 s) and the dependency on the Kinect device. In this article, a new architecture is proposed that uses standard cameras and reduces the delay of hand and gesture detection. The state of the art shows that YOLOv2 with the Darknet framework is a good option, with faster recognition times than other CNNs. We have implemented YOLOv2 for hand and sign detection, with good gesture-detection results and a recognition time of 0.10 seconds per gesture in laboratory conditions. Future work will include reducing the errors of our model, recognizing intuitive and standard gestures, and running tests in real conditions.
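To make the transfer-learning step concrete, below is a minimal sketch of the kind of pipeline the abstract describes: a VGG16 backbone pre-trained on ImageNet, frozen, feeding a small trainable classification head for the 10 LSF gestures. The framework (Keras/TensorFlow), input size, and head layout are assumptions for illustration; the paper does not specify them.

```python
# Minimal transfer-learning sketch (assumed Keras/TensorFlow; the paper does
# not name a framework). A frozen VGG16 backbone feeds a small trainable
# classification head for 10 LSF gestures.
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

NUM_GESTURES = 10  # 10 LSF gestures, as reported in the abstract

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze convolutional features; train only the head

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_GESTURES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...) on labeled hand crops
```

Because only the head is trained, a new user-defined gesture can in principle be added by collecting a few examples and re-running this short training step, which matches the automatic transfer-learning workflow the abstract attributes to the Gesture Tool Box interface.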
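The YOLOv2 inference side can be sketched in a similar spirit. The authors use the Darknet framework; one convenient way to run a trained Darknet model against a standard RGB camera (no depth sensor) is OpenCV's DNN module, which reads Darknet .cfg/.weights files. The file names, input resolution (416x416), and thresholds below are illustrative placeholders, not the authors' actual configuration.

```python
# Hedged sketch: running a trained YOLOv2 hand/gesture detector with OpenCV's
# DNN module instead of the original Darknet binary. The .cfg/.weights file
# names are placeholders for the model the authors trained.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov2-gestures.cfg", "yolov2-gestures.weights")

def detect(frame, conf_threshold=0.5):
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    out = net.forward()  # rows: [cx, cy, bw, bh, objectness, class scores...]
    boxes = []
    for det in out:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(det[4] * scores[class_id])
        if conf > conf_threshold:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append((int(cx - bw / 2), int(cy - bh / 2),
                          int(bw), int(bh), class_id, conf))
    return boxes

cap = cv2.VideoCapture(0)  # standard RGB webcam, no depth camera required
ok, frame = cap.read()
if ok:
    print(detect(frame))
cap.release()
```

A single-pass detector of this kind predicts all bounding boxes and class scores in one forward pass per frame, which is what makes the reported 0.10 s per-gesture recognition time plausible compared with multi-stage CNN pipelines.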
Conference Presentation
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Florian Pereme, Jesus Zegarra Flores, Mirko Scavazzin, Franck Valentini, and Jean-Pierre Radoux "Conception of a touchless human machine interaction system for operating rooms using deep learning", Proc. SPIE 10679, Optics, Photonics, and Digital Technologies for Imaging Applications V, 106790R (24 May 2018); https://doi.org/10.1117/12.2319141
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Human-machine interfaces, Neural networks, Surgery, Cameras, Convolutional neural networks, Data modeling, RGB color model