Pattern Recognition

Volume 65, May 2017, Pages 211-222

Explaining nonlinear classification decisions with deep Taylor decomposition

https://doi.org/10.1016/j.patcog.2016.11.008

Open access under a Creative Commons license.

Highlights

  • A novel method to explain nonlinear classification decisions in terms of input variables is introduced.

  • The method is based on Taylor expansions and decomposes the output of a deep neural network in terms of input variables.

  • The resulting deep Taylor decomposition can be applied directly to existing neural networks without retraining.

  • The method is tested on two large-scale neural networks for image classification: BVLC CaffeNet and GoogLeNet.

Abstract

Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems such as image recognition. Although these methods perform impressively well, they have a significant disadvantage: their lack of transparency limits the interpretability of the solution and thus the scope of application in practice. DNNs in particular act as black boxes due to their multilayer nonlinear structure. In this paper we introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network classification decision into contributions of its input elements. Although our focus is on image classification, the method is applicable to a broad set of input data, learning tasks and network architectures. Our method, called deep Taylor decomposition, efficiently utilizes the structure of the network by backpropagating the explanations from the output to the input layer. We evaluate the proposed method empirically on the MNIST and ILSVRC data sets.
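For intuition, the decomposition the abstract refers to can be written as a first-order Taylor expansion of the network output f around a root point x̃ chosen such that f(x̃) = 0, so that the prediction splits additively into per-input relevances R_i:

    f(x) \approx \sum_i R_i, \qquad R_i = \left.\frac{\partial f}{\partial x_i}\right|_{x=\tilde{x}} (x_i - \tilde{x}_i)

Deep Taylor decomposition applies this expansion layer by layer rather than to the whole network at once, propagating relevance from the output back to the inputs. Below is a minimal NumPy sketch, not the authors' implementation, of one such propagation step through a ReLU layer with non-negative activations, using the z+-rule in which relevance is redistributed in proportion to each input's positive contribution a_i·w_ij^+; the function name and the stabilizer constant are illustrative assumptions.

    import numpy as np

    def zplus_backprop(a, W, R_out):
        # One relevance-propagation step (z+-rule) through a
        # linear/ReLU layer; hypothetical sketch, not the paper's code.
        #   a     : (d,)   non-negative input activations of the layer
        #   W     : (d, m) weight matrix
        #   R_out : (m,)   relevance already assigned to the layer outputs
        # Returns (d,) relevance redistributed onto the layer inputs.
        Wp = np.maximum(W, 0.0)   # keep only the positive weights
        z = a @ Wp + 1e-9         # positive pre-activations (small stabilizer)
        s = R_out / z             # relevance per unit of activation
        return a * (Wp @ s)       # share proportional to a_i * w_ij^+

Applied recursively from the top layer down, such a rule approximately conserves relevance at each layer (the returned vector sums to roughly R_out), which is what allows the resulting heatmap over input pixels to be read as a decomposition of the classification decision.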

Keywords

Deep neural networks
Heatmapping
Taylor decomposition
Relevance propagation
Image recognition

Vitae

Grégoire Montavon received a Master's degree in Communication Systems from École Polytechnique Fédérale de Lausanne in 2009 and a Ph.D. degree in Machine Learning from Technische Universität Berlin in 2013. He is currently a Research Associate in the Machine Learning Group at TU Berlin.

Sebastian Lapuschkin received a Master's degree in Computer Science from Technische Universität Berlin in 2013. He is currently a Research Associate in the Machine Learning Group at the Fraunhofer Heinrich Hertz Institute while pursuing his Ph.D. at TU Berlin. His research interests are computer vision, machine learning and data analysis.

Alexander Binder is an Assistant Professor at the Singapore University of Technology and Design. He received a Ph.D. in Machine Learning from Technische Universität Berlin in 2013. He has previously participated in the Pascal VOC and ImageCLEF competitions. His research interests include neural networks, image analysis and medical imaging.

Wojciech Samek received a Diploma degree in Computer Science from Humboldt University Berlin in 2010 and a Ph.D. degree in Machine Learning from Technische Universität Berlin in 2014. He currently directs the Machine Learning Group at the Fraunhofer Heinrich Hertz Institute. His research interests include neural networks and signal processing.

Klaus-Robert Müller (Ph.D. 1992) has been a Professor of Computer Science at TU Berlin since 2006 and is co-director of the Berlin Big Data Center. He won the 1999 Olympus Prize of the German Pattern Recognition Society, the 2006 SEL Alcatel Communication Award, and the 2014 Science Prize of Berlin. Since 2012, he has been an elected member of the German National Academy of Sciences – Leopoldina.