
Holistically-Nested Edge Detection

International Journal of Computer Vision

Abstract

We develop a new edge detection algorithm that addresses two important issues in this long-standing vision problem: (1) holistic image training and prediction; and (2) multi-scale and multi-level feature learning. Our proposed method, holistically-nested edge detection (HED), performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSDS500 dataset (ODS F-score of 0.790) and the NYU Depth dataset (ODS F-score of 0.746), and do so with an improved speed (0.4 s per image) that is orders of magnitude faster than some CNN-based edge detection algorithms developed before HED. We also observe encouraging results on other boundary detection benchmark datasets such as Multicue and PASCAL-Context.
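To make the side-output idea concrete, the sketch below shows one way the architecture described above could be expressed in PyTorch: a trimmed VGG-16 backbone whose five convolutional stages each feed a 1x1 side classifier, the side responses are upsampled to the input resolution and combined by a learned fusion layer, and a class-balanced cross-entropy loss is applied to every output (deep supervision). This is only an illustrative sketch of the idea, not the authors' released implementation; the stage boundaries, module names, and the helper loss function are assumptions made for the example.

```python
# Minimal PyTorch sketch of holistically-nested side outputs with deep supervision.
# Not the authors' implementation; stage splits and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class HEDSketch(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = torchvision.models.vgg16().features
        # Five VGG-16 stages, each ending at the last conv/ReLU before the next pooling.
        self.stages = nn.ModuleList([vgg[0:4], vgg[4:9], vgg[9:16], vgg[16:23], vgg[23:30]])
        # One 1x1 convolution per stage produces a single-channel side response.
        self.side = nn.ModuleList([nn.Conv2d(c, 1, kernel_size=1)
                                   for c in (64, 128, 256, 512, 512)])
        # The fusion layer learns a weighted combination of the five side outputs.
        self.fuse = nn.Conv2d(5, 1, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[2:]
        sides, feat = [], x
        for stage, side_conv in zip(self.stages, self.side):
            feat = stage(feat)
            # Upsample each side response back to the input resolution.
            s = F.interpolate(side_conv(feat), size=(h, w),
                              mode='bilinear', align_corners=False)
            sides.append(s)
        fused = self.fuse(torch.cat(sides, dim=1))
        return sides + [fused]  # deep supervision: a loss is attached to every output


def class_balanced_bce(logits, target):
    # Weight positives by the fraction of negatives (and vice versa),
    # since edge pixels are heavily outnumbered by non-edge pixels.
    pos = target.sum()
    neg = target.numel() - pos
    weight = torch.where(target > 0.5, neg / (pos + neg), pos / (pos + neg))
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)


# Example training step (hypothetical tensors): sum the loss over all outputs.
# loss = sum(class_balanced_bce(o, gt) for o in HEDSketch()(img))
```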



Acknowledgements

This work is supported by NSF awards IIS-1216528 (IIS-1360566), IIS-0844566 (IIS-1360568), and IIS-1618477, and a Northrop Grumman Contextual Robotics Grant. We thank Patrick Gallagher and Jameson Merkow for helping improve this manuscript. We also thank Piotr Dollár and Yin Li for insightful discussions. We are grateful for the generous donation of GPUs by NVIDIA.

Author information


Corresponding author

Correspondence to Zhuowen Tu.

Additional information

Communicated by K. Ikeuchi.


About this article


Cite this article

Xie, S., Tu, Z. Holistically-Nested Edge Detection. Int J Comput Vis 125, 3–18 (2017). https://doi.org/10.1007/s11263-017-1004-z
