Aggregated Contextual Transformations for High-Resolution Image Inpainting

Zeng, Yanhong; Fu, Jianlong; Chao, Hongyang; Guo, Baining

Full-text links:

Download:

Current browse context:

cs.CV

< prev | next >

new | recent | 2104

Change to browse by:

Computer Science > Computer Vision and Pattern Recognition

Title: Aggregated Contextual Transformations for High-Resolution Image Inpainting

Authors: Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

(Submitted on 3 Apr 2021)

Abstract: State-of-the-art image inpainting approaches can suffer from generating distorted structures and blurry textures in high-resolution images (e.g., 512x512). The challenges mainly drive from (1) image content reasoning from distant contexts, and (2) fine-grained texture synthesis for a large missing region. To overcome these two challenges, we propose an enhanced GAN-based model, named Aggregated COntextual-Transformation GAN (AOT-GAN), for high-resolution image inpainting. Specifically, to enhance context reasoning, we construct the generator of AOT-GAN by stacking multiple layers of a proposed AOT block. The AOT blocks aggregate contextual transformations from various receptive fields, allowing to capture both informative distant image contexts and rich patterns of interest for context reasoning. For improving texture synthesis, we enhance the discriminator of AOT-GAN by training it with a tailored mask-prediction task. Such a training objective forces the discriminator to distinguish the detailed appearances of real and synthesized patches, and in turn, facilitates the generator to synthesize clear textures. Extensive comparisons on Places2, the most challenging benchmark with 1.8 million high-resolution images of 365 complex scenes, show that our model outperforms the state-of-the-art by a significant margin in terms of FID with 38.60% relative improvement. A user study including more than 30 subjects further validates the superiority of AOT-GAN. We further evaluate the proposed AOT-GAN in practical applications, e.g., logo removal, face editing, and object removal. Results show that our model achieves promising completions in the real world. We release code and models in this https URL

Comments:	This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2104.01431 [cs.CV]
	(or arXiv:2104.01431v1 [cs.CV] for this version)

Submission history

From: Yanhong Zeng [view email]
[v1] Sat, 3 Apr 2021 15:50:17 GMT (4855kb,D)

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

Link back to: arXiv, form interface, contact.

> cs > arXiv:2104.01431

Download:

Current browse context:

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

Computer Science > Computer Vision and Pattern Recognition

Title: Aggregated Contextual Transformations for High-Resolution Image Inpainting

Submission history