Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

Luo, Huiwen; Nagano, Koki; Kung, Han-Wei; Goldwhite, Mclean; Xu, Qingguo; Wang, Zejian; Wei, Lingyu; Hu, Liwen; Li, Hao

(Submitted on 21 Jun 2021)

Abstract: We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. Even when the input image shows a smiling person or was taken under extreme lighting conditions, our method reliably produces a high-quality textured model of the person's face in a neutral expression, with skin textures under diffuse lighting conditions. Cutting-edge 3D face reconstruction methods use non-linear morphable face models combined with GAN-based decoders to capture the likeness and details of a person, but they fail to produce neutral head models with unshaded albedo textures, which are critical for creating relightable and animation-friendly avatars for integration in virtual environments. The key challenge for existing methods is the lack of training and ground-truth data containing normalized 3D faces. We propose a two-stage approach to address this problem. First, we adopt a highly robust normalized 3D face generator by embedding a non-linear morphable face model into a StyleGAN2 network. This allows us to generate detailed but normalized facial assets. This inference is then followed by a perceptual refinement step that uses the generated assets as regularization to cope with the limited available training samples of normalized faces. We further introduce a Normalized Face Dataset, which consists of a combination of photogrammetry scans, carefully selected photographs, and generated fake people with neutral expressions in diffuse lighting conditions. While our prepared dataset contains two orders of magnitude fewer subjects than those used by cutting-edge GAN-based 3D facial reconstruction methods, we show that it is possible to produce high-quality normalized face models for very challenging unconstrained input images, and we demonstrate superior performance to the current state of the art.
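
The second stage described in the abstract (refining an initial estimate against the input photo while using the generated assets as regularization) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the linear map `A` stands in for a generator followed by a perceptual feature extractor, and the names `refine_latent`, `objective`, and `reg_weight` are hypothetical. It is not the authors' implementation, network, or loss.

```python
import numpy as np

def objective(w, w0, A, target_feat, reg_weight):
    """Feature-matching term plus a penalty for drifting away from w0."""
    data_term = np.sum((A @ w - target_feat) ** 2)
    reg_term = reg_weight * np.sum((w - w0) ** 2)
    return float(data_term + reg_term)

def refine_latent(w0, A, target_feat, reg_weight=0.1, steps=500):
    """Gradient descent on the objective above, starting from w0."""
    # Step size 1/L, where L bounds the curvature of the quadratic objective.
    lr = 1.0 / (2.0 * (np.linalg.norm(A, 2) ** 2 + reg_weight))
    w = w0.copy()
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ w - target_feat) + 2.0 * reg_weight * (w - w0)
        w = w - lr * grad
    return w

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 4))        # hypothetical stand-in: generator + feature extractor
w0 = rng.normal(size=4)            # stage 1: initial latent code inferred from the photo
target_feat = rng.normal(size=8)   # stand-in for perceptual features of the input photo

# Stage 2: refine toward the photo, with weak and strong regularization.
w_weak = refine_latent(w0, A, target_feat, reg_weight=0.1)
w_strong = refine_latent(w0, A, target_feat, reg_weight=100.0)
```

In this toy setting, a larger `reg_weight` keeps the refined code closer to the initial estimate `w0`, which mirrors the role the abstract assigns to the generated assets when few normalized training samples are available.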

Comments:	Accepted to CVPR 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2106.11423 [cs.CV]
	(or arXiv:2106.11423v1 [cs.CV] for this version)

Submission history

From: Liwen Hu
[v1] Mon, 21 Jun 2021 21:57:16 GMT (30385kb,D)
