
Holistically-Nested Edge Detection

International Journal of Computer Vision

Abstract

We develop a new edge detection algorithm that addresses two important issues in this long-standing vision problem: (1) holistic image training and prediction; and (2) multi-scale and multi-level feature learning. Our proposed method, holistically-nested edge detection (HED), performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSDS500 dataset (ODS F-score of 0.790) and the NYU Depth dataset (ODS F-score of 0.746), and do so with an improved speed (0.4 s per image) that is orders of magnitude faster than some CNN-based edge detection algorithms developed before HED. We also observe encouraging results on other boundary detection benchmark datasets such as Multicue and PASCAL-Context.
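To make the side-output idea concrete, the sketch below shows one way the architecture described above could be expressed in PyTorch: a trimmed VGG-16 backbone whose five convolutional stages each feed a 1x1 side classifier, the side responses are upsampled to the input resolution and combined by a learned fusion layer, and a class-balanced cross-entropy loss is applied to every output (deep supervision). This is only an illustrative sketch of the idea, not the authors' released implementation; the stage boundaries, module names, and the helper loss function are assumptions made for the example.

```python
# Minimal PyTorch sketch of holistically-nested side outputs with deep supervision.
# Not the authors' implementation; stage splits and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class HEDSketch(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = torchvision.models.vgg16().features
        # Five VGG-16 stages, each ending at the last conv/ReLU before the next pooling.
        self.stages = nn.ModuleList([vgg[0:4], vgg[4:9], vgg[9:16], vgg[16:23], vgg[23:30]])
        # One 1x1 convolution per stage produces a single-channel side response.
        self.side = nn.ModuleList([nn.Conv2d(c, 1, kernel_size=1)
                                   for c in (64, 128, 256, 512, 512)])
        # The fusion layer learns a weighted combination of the five side outputs.
        self.fuse = nn.Conv2d(5, 1, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[2:]
        sides, feat = [], x
        for stage, side_conv in zip(self.stages, self.side):
            feat = stage(feat)
            # Upsample each side response back to the input resolution.
            s = F.interpolate(side_conv(feat), size=(h, w),
                              mode='bilinear', align_corners=False)
            sides.append(s)
        fused = self.fuse(torch.cat(sides, dim=1))
        return sides + [fused]  # deep supervision: a loss is attached to every output


def class_balanced_bce(logits, target):
    # Weight positives by the fraction of negatives (and vice versa),
    # since edge pixels are heavily outnumbered by non-edge pixels.
    pos = target.sum()
    neg = target.numel() - pos
    weight = torch.where(target > 0.5, neg / (pos + neg), pos / (pos + neg))
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)


# Example training step (hypothetical tensors): sum the loss over all outputs.
# loss = sum(class_balanced_bce(o, gt) for o in HEDSketch()(img))
```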



Acknowledgements

This work is supported by NSF awards IIS-1216528 (IIS-1360566), IIS-0844566 (IIS-1360568), and IIS-1618477, and a Northrop Grumman Contextual Robotics Grant. We thank Patrick Gallagher and Jameson Merkow for helping improve this manuscript. We also thank Piotr Dollár and Yin Li for insightful discussions. We are grateful for the generous donation of GPUs by NVIDIA.

Author information


Corresponding author

Correspondence to Zhuowen Tu.

Additional information

Communicated by K. Ikeuchi.


About this article


Cite this article

Xie, S., Tu, Z. Holistically-Nested Edge Detection. Int J Comput Vis 125, 3–18 (2017). https://doi.org/10.1007/s11263-017-1004-z
