Abstract
We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully convolutional neural network, we can complete images of arbitrary resolution by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess whether it is coherent as a whole, while the local discriminator looks only at a small area centered on the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones in both overall consistency and fine detail. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete images of objects with familiar and highly specific structures, such as faces.
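The interplay between the two discriminators can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`local_window`, `crop`, `composite`, `generator_adversarial_loss`), the rectangular hole, the list-of-lists image representation, and the choice to simply sum the two adversarial terms are all assumptions made for illustration. The sketch shows how a fixed-size window centered on the completed region might be selected for the local discriminator, how generated pixels could be merged into the original image, and how a generator loss could reward fooling both discriminators at once.

```python
import math


def local_window(hole_box, image_hw, patch):
    """Top-left corner of a patch x patch window centered on the hole.

    hole_box is (top, left, height, width); the window is clamped so it
    stays inside the image. Assumes the image is at least patch x patch.
    """
    top, left, h, w = hole_box
    img_h, img_w = image_hw
    cy, cx = top + h // 2, left + w // 2
    y0 = min(max(cy - patch // 2, 0), img_h - patch)
    x0 = min(max(cx - patch // 2, 0), img_w - patch)
    return y0, x0


def crop(image, y0, x0, patch):
    """Extract the local-discriminator input from a 2D list image."""
    return [row[x0:x0 + patch] for row in image[y0:y0 + patch]]


def composite(original, generated, mask):
    """Keep original pixels where mask is 0 and generated pixels where it is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def generator_adversarial_loss(p_global, p_local):
    """Hypothetical generator loss: small only when BOTH discriminators
    assign a high 'real' probability to the completed image."""
    eps = 1e-12  # avoids log(0)
    return -(math.log(p_global + eps) + math.log(p_local + eps))


# Usage: an 8x8 image with a 2x2 hole near the bottom-right corner.
original = [[0] * 8 for _ in range(8)]
generated = [[9] * 8 for _ in range(8)]
mask = [[1 if 5 <= y < 7 and 5 <= x < 7 else 0 for x in range(8)] for y in range(8)]

completed = composite(original, generated, mask)
y0, x0 = local_window((5, 5, 2, 2), (8, 8), patch=4)
local_input = crop(completed, y0, x0, 4)   # what a local discriminator would see
global_input = completed                   # what a global discriminator would see
```

In this sketch the global discriminator receives the whole completed image while the local one receives only the clamped window around the hole, so a completion that is plausible in detail but inconsistent with the rest of the scene (or the reverse) would still be penalized by one of the two terms.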