Title: VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Abstract: We present VideoReTalking, a new system that edits the faces in a real-world talking-head video according to input audio, producing a high-quality, lip-synced output video, even with a different emotion. Our system decomposes this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism. Given a talking-head video, we first modify the expression of each frame according to a single shared expression template using the expression editing network, resulting in a video with the canonical expression. This video, together with the given audio, is then fed into the lip-sync network to generate a lip-synced video. Finally, we improve the photo-realism of the synthesized faces through an identity-aware face enhancement network and post-processing. We use learning-based approaches for all three steps, and all our modules run in a sequential pipeline without any user intervention. Furthermore, our system is a generic approach that does not need to be retrained for a specific person. Evaluations on two widely used datasets and on in-the-wild examples demonstrate the superiority of our framework over other state-of-the-art methods in terms of lip-sync accuracy and visual quality.
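
The three-stage ordering described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Frame` type, the function names (`edit_expression`, `lip_sync`, `enhance`, `retalk`), and the audio representation are all hypothetical placeholders, and each stage only records that it ran instead of invoking a network. The sketch shows only the data flow: canonical-expression frames feed the lip-sync stage together with the audio, and the result is then enhanced.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one video frame; a real system would hold pixels."""
    index: int
    history: Tuple[str, ...] = ()


def _tag(frame: Frame, stage: str) -> Frame:
    # Record that a stage processed this frame (placeholder for real image edits).
    return replace(frame, history=frame.history + (stage,))


def edit_expression(frames: Sequence[Frame]) -> List[Frame]:
    # Stage 1 (placeholder): map every frame to the same canonical expression.
    return [_tag(f, "canonical_expression") for f in frames]


def lip_sync(frames: Sequence[Frame], audio: Sequence[float]) -> List[Frame]:
    # Stage 2 (placeholder): drive the mouth region from the audio signal.
    if not audio:
        raise ValueError("lip-sync needs a non-empty audio signal")
    return [_tag(f, "lip_sync") for f in frames]


def enhance(frames: Sequence[Frame]) -> List[Frame]:
    # Stage 3 (placeholder): identity-aware enhancement and post-processing.
    return [_tag(f, "enhance") for f in frames]


def retalk(frames: Sequence[Frame], audio: Sequence[float]) -> List[Frame]:
    # Run the stages strictly in sequence, with no user intervention in between.
    canonical = edit_expression(frames)
    synced = lip_sync(canonical, audio)
    return enhance(synced)
```

Calling `retalk([Frame(0), Frame(1)], [0.0, 0.1])` returns two frames whose `history` lists the stages in the order they were applied. Keeping each stage as a separate function mirrors the abstract's point that the modules are independent and chained sequentially.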
Comments: Accepted by SIGGRAPH Asia 2022 Conference Proceedings. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2211.14758 [cs.CV]
  (or arXiv:2211.14758v1 [cs.CV] for this version)

Submission history

From: Kun Cheng
[v1] Sun, 27 Nov 2022 08:14:23 GMT (2882kb,D)