research-article

Globally and locally consistent image completion

Published: 20 July 2017

Abstract

We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully convolutional neural network, we can complete images of arbitrary resolution by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess whether it is coherent as a whole, while the local discriminator looks only at a small area centered on the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones with regard to overall consistency as well as in detail. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete images of objects with familiar and highly specific structures, such as faces.
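
One way to make the adversarial setup described above concrete is sketched below: a completion network fills the masked region, a global discriminator scores the whole image, a local discriminator scores a crop centered on the hole, and the completion network is trained to fool both while also reconstructing the missing pixels. This PyTorch code is a minimal sketch based only on the description above, not the authors' implementation; the placeholder architectures, the MSE reconstruction term, the loss weight alpha, the fixed 128x128 crop, and the simple summing of the two discriminator scores are all illustrative assumptions.

```python
# Minimal sketch of two-discriminator adversarial training for image completion.
# Architectures, loss weight, and crop size are illustrative assumptions, not the
# networks described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CompletionNet(nn.Module):
    """Toy fully convolutional completion network (placeholder architecture)."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, 32, 5, padding=2), nn.ReLU(),   # RGB + binary mask channel
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image, mask):
        # Erase the hole from the input and append the mask as a fourth channel.
        x = torch.cat([image * (1 - mask), mask], dim=1)
        return self.body(x)


def make_discriminator(in_ch=3):
    """Toy discriminator mapping an image (or patch) to a single real/fake logit."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
        nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
    )


def crop_around_hole(img, mask, size=128):
    """Crop a size x size window centered on the hole (assumes one box-shaped hole)."""
    ys, xs = torch.nonzero(mask[0, 0], as_tuple=True)
    cy, cx = int(ys.float().mean()), int(xs.float().mean())
    half = size // 2
    cy = min(max(cy, half), img.shape[2] - half)
    cx = min(max(cx, half), img.shape[3] - half)
    return img[:, :, cy - half:cy + half, cx - half:cx + half]


# --- one illustrative training step ----------------------------------------
completion = CompletionNet()
d_global, d_local = make_discriminator(), make_discriminator()
alpha = 4e-4                                # adversarial loss weight (assumed value)

real = torch.rand(1, 3, 256, 256)           # stand-in for a training image
mask = torch.zeros(1, 1, 256, 256)
mask[:, :, 96:160, 96:160] = 1.0            # a square hole to be filled

out = completion(real, mask)
completed = real * (1 - mask) + out * mask  # paste the prediction into the hole

# Score the full image (global) and the patch around the hole (local);
# summing the two logits is a simplification made for this sketch.
fake_score = d_global(completed) + d_local(crop_around_hole(completed, mask))
real_score = d_global(real) + d_local(crop_around_hole(real, mask))

ones, zeros = torch.ones_like(fake_score), torch.zeros_like(fake_score)

# Completion network: reconstruct the missing pixels AND fool both discriminators.
loss_g = F.mse_loss(out * mask, real * mask) + \
         alpha * F.binary_cross_entropy_with_logits(fake_score, ones)

# Discriminators: distinguish real images/patches from completed ones.
loss_d = F.binary_cross_entropy_with_logits(real_score, ones) + \
         F.binary_cross_entropy_with_logits(fake_score.detach(), zeros)
```

In an actual training loop the discriminators and the completion network would be updated alternately with separate optimizers, as is standard for adversarial training.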


Supplemental Material

papers-0331.mp4 (MP4, 456.8 MB)

References

  1. Coloma Ballester, Marcelo Bertalmío, Vicent Caselles, Guillermo Sapiro, and Joan Verdera. 2001. Filling-in by joint interpolation of vector fields and gray levels. IEEE Transactions on Image Processing 10, 8 (2001), 1200--1211.
  2. Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. 2009. PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 28, 3 (2009), 24:1--24:11.
  3. Connelly Barnes, Eli Shechtman, Dan B. Goldman, and Adam Finkelstein. 2010. The Generalized PatchMatch Correspondence Algorithm. In European Conference on Computer Vision. 29--43.
  4. Marcelo Bertalmio, Guillermo Sapiro, Vincent Caselles, and Coloma Ballester. 2000. Image Inpainting. In ACM Transactions on Graphics (Proceedings of SIGGRAPH). 417--424.
  5. M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. 2003. Simultaneous structure and texture image inpainting. IEEE Transactions on Image Processing 12, 8 (2003), 882--889.
  6. A. Criminisi, P. Perez, and K. Toyama. 2004. Region Filling and Object Removal by Exemplar-based Image Inpainting. IEEE Transactions on Image Processing 13, 9 (2004), 1200--1212.
  7. Soheil Darabi, Eli Shechtman, Connelly Barnes, Dan B Goldman, and Pradeep Sen. 2012. Image Melding: Combining Inconsistent Images using Patch-based Synthesis. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 31, 4, Article 82 (2012), 10 pages.
  8. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A Large-Scale Hierarchical Image Database. In IEEE Conference on Computer Vision and Pattern Recognition.
  9. Yue Deng, Qionghai Dai, and Zengke Zhang. 2011. Graph Laplace for occluded face completion and recognition. IEEE Transactions on Image Processing 20, 8 (2011), 2329--2338.
  10. Iddo Drori, Daniel Cohen-Or, and Hezy Yeshurun. 2003. Fragment-based Image Completion. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 22, 3 (2003), 303--312.
  11. Alexei Efros and Thomas Leung. 1999. Texture Synthesis by Non-parametric Sampling. In International Conference on Computer Vision. 1033--1038.
  12. Alexei A. Efros and William T. Freeman. 2001. Image Quilting for Texture Synthesis and Transfer. In ACM Transactions on Graphics (Proceedings of SIGGRAPH). 341--346.
  13. Kunihiko Fukushima. 1988. Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks 1, 2 (1988), 119--130.
  14. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Conference on Neural Information Processing Systems. 2672--2680.
  15. James Hays and Alexei A. Efros. 2007. Scene Completion Using Millions of Photographs. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 26, 3, Article 4 (2007).
  16. Kaiming He and Jian Sun. 2012. Statistics of Patch Offsets for Image Completion. In European Conference on Computer Vision. 16--29.
  17. Jia-Bin Huang, Sing Bing Kang, Narendra Ahuja, and Johannes Kopf. 2014. Image Completion Using Planar Structure Guidance. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 33, 4, Article 129 (2014), 10 pages.
  18. Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In International Conference on Machine Learning.
  19. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-Image Translation with Conditional Adversarial Networks. In IEEE Conference on Computer Vision and Pattern Recognition.
  20. Jiaya Jia and Chi-Keung Tang. 2003. Image repairing: robust image synthesis by adaptive ND tensor voting. In IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1. 643--650.
  21. Rolf Köhler, Christian Schuler, Bernhard Schölkopf, and Stefan Harmeling. 2014. Mask-specific inpainting with deep neural networks. In German Conference on Pattern Recognition.
  22. Johannes Kopf, Wolf Kienzle, Steven Drucker, and Sing Bing Kang. 2012. Quality Prediction for Image Completion. ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia) 31, 6, Article 131 (2012), 8 pages.
  23. Vivek Kwatra, Irfan Essa, Aaron Bobick, and Nipun Kwatra. 2005. Texture Optimization for Example-based Synthesis. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 24, 3 (July 2005), 795--802.
  24. Vivek Kwatra, Arno Schödl, Irfan Essa, Greg Turk, and Aaron Bobick. 2003. Graphcut Textures: Image and Video Synthesis Using Graph Cuts. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 22, 3 (July 2003), 277--286.
  25. Yann LeCun, Bernhard Boser, John S. Denker, Donnie Henderson, Richard E. Howard, Wayne Hubbard, and Lawrence D. Jackel. 1989. Backpropagation applied to handwritten zip code recognition. Neural Computation 1, 4 (1989), 541--551.
  26. Anat Levin, Assaf Zomet, and Yair Weiss. 2003. Learning How to Inpaint from Global Image Statistics. In International Conference on Computer Vision. 305--312.
  27. Rongjian Li, Wenlu Zhang, Heung-Il Suk, Li Wang, Jiang Li, Dinggang Shen, and Shuiwang Ji. 2014. Deep learning based imaging data completion for improved brain disease diagnosis. In International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 305--312.
  28. Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. 2015. Deep Learning Face Attributes in the Wild. In International Conference on Computer Vision.
  29. Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition.
  30. Umar Mohammed, Simon J. D. Prince, and Jan Kautz. 2009. Visio-lization: generating novel facial images. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 28, 3 (2009), Article 57.
  31. Vinod Nair and Geoffrey E. Hinton. 2010. Rectified Linear Units Improve Restricted Boltzmann Machines. In International Conference on Machine Learning. 807--814.
  32. Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, and Alexei Efros. 2016. Context Encoders: Feature Learning by Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition.
  33. Darko Pavić, Volker Schönefeld, and Leif Kobbelt. 2006. Interactive image completion with perspective correction. The Visual Computer 22, 9 (2006), 671--681.
  34. Patrick Pérez, Michel Gangnet, and Andrew Blake. 2003. Poisson Image Editing. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 22, 3 (July 2003), 313--318.
  35. Alec Radford, Luke Metz, and Soumith Chintala. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In International Conference on Learning Representations.
  36. Radim Tyleček and Radim Šára. 2013. Spatial Pattern Templates for Recognition of Objects with Regular Structure. In German Conference on Pattern Recognition. Saarbrücken, Germany.
  37. Jimmy SJ Ren, Li Xu, Qiong Yan, and Wenxiu Sun. 2015. Shepard Convolutional Neural Networks. In Conference on Neural Information Processing Systems.
  38. D. E. Rumelhart, G. E. Hinton, and R. J. Williams. 1986. Learning representations by back-propagating errors. Nature 323 (1986), 533--536.
  39. Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved Techniques for Training GANs. In Conference on Neural Information Processing Systems.
  40. Denis Simakov, Yaron Caspi, Eli Shechtman, and Michal Irani. 2008. Summarizing visual data using bidirectional similarity. In IEEE Conference on Computer Vision and Pattern Recognition. 1--8.
  41. Jian Sun, Lu Yuan, Jiaya Jia, and Heung-Yeung Shum. 2005. Image Completion with Structure Propagation. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 24, 3 (July 2005), 861--868.
  42. Alexandru Telea. 2004. An Image Inpainting Technique Based on the Fast Marching Method. Journal of Graphics Tools 9, 1 (2004), 23--34.
  43. Yonatan Wexler, Eli Shechtman, and Michal Irani. 2007. Space-Time Completion of Video. IEEE Transactions on Pattern Analysis and Machine Intelligence 29, 3 (2007), 463--476.
  44. Oliver Whyte, Josef Sivic, and Andrew Zisserman. 2009. Get Out of my Picture! Internet-based Inpainting. In British Machine Vision Conference.
  45. Junyuan Xie, Linli Xu, and Enhong Chen. 2012. Image Denoising and Inpainting with Deep Neural Networks. In Conference on Neural Information Processing Systems. 341--349.
  46. Chao Yang, Xin Lu, Zhe Lin, Eli Shechtman, Oliver Wang, and Hao Li. 2017. High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis. In IEEE Conference on Computer Vision and Pattern Recognition.
  47. Fisher Yu and Vladlen Koltun. 2016. Multi-Scale Context Aggregation by Dilated Convolutions. In International Conference on Learning Representations.
  48. Matthew D. Zeiler. 2012. ADADELTA: An Adaptive Learning Rate Method. CoRR abs/1212.5701 (2012).
  49. Bolei Zhou, Aditya Khosla, Àgata Lapedriza, Antonio Torralba, and Aude Oliva. 2016. Places: An Image Database for Deep Scene Understanding. CoRR abs/1610.02055 (2016).


      • Published in

        ACM Transactions on Graphics, Volume 36, Issue 4 (August 2017), 2155 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/3072959

        Copyright © 2017 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States
