Simple and Effective Unsupervised Speech Synthesis

Liu, Alexander H.; Lai, Cheng-I Jeff; Hsu, Wei-Ning; Auli, Michael; Baevski, Alexei; Glass, James

(Submitted on 6 Apr 2022 (v1), last revised 20 Apr 2022 (this version, v3))

Abstract: We introduce the first unsupervised speech synthesis system based on a simple, yet effective recipe. The framework leverages recent work in unsupervised speech recognition as well as existing neural-based speech synthesis. Using only unlabeled speech audio and unlabeled text as well as a lexicon, our method enables speech synthesis without the need for a human-labeled corpus. Experiments demonstrate the unsupervised system can synthesize speech similar to a supervised counterpart in terms of naturalness and intelligibility measured by human evaluation.

Comments:	preprint, equal contribution from first two authors
Subjects:	Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2204.02524 [cs.SD]
	(or arXiv:2204.02524v3 [cs.SD] for this version)

Submission history

From: Alexander H. Liu
[v1] Wed, 6 Apr 2022 00:19:13 GMT (571kb,D)
[v2] Thu, 7 Apr 2022 02:46:21 GMT (571kb,D)
[v3] Wed, 20 Apr 2022 17:45:35 GMT (573kb,D)
