Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training

Yang, J.; He, Lei

(Submitted on 20 Jan 2022)

Abstract: Cross-lingual speech synthesis makes it possible to synthesize speech in multiple languages for a monoglot speaker. Normally, only data from monoglot speakers are available for model training, so the speaker similarity between the synthesized cross-lingual speech and the speaker's native-language recordings is relatively low. Building on a multilingual transformer text-to-speech model, this paper studies a multi-task learning framework to improve cross-lingual speaker similarity. To improve speaker similarity further, joint training with a speaker classifier is proposed. A scheme similar to parallel scheduled sampling is used so that introducing joint training does not break the transformer's parallel training mechanism and the model can still be trained efficiently. In both subjective and objective evaluations, multi-task learning combined with speaker classifier joint training consistently improves cross-lingual speaker similarity for speakers both seen and unseen during training.
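
The abstract names parallel scheduled sampling only as the inspiration for its training scheme and gives no details. As a rough illustration of the general idea, not of the paper's actual method, the sketch below runs a first decoder pass under teacher forcing, then swaps some ground-truth frames for first-pass predictions to form the inputs of a second pass. Both passes see the whole sequence at once, so training stays parallel. The function names, the scalar "frames", and the `toy_decoder` stand-in are all hypothetical, and the usual one-step right shift of decoder inputs is omitted for brevity.

```python
import random


def mix_decoder_inputs(gold_frames, predicted_frames, sample_prob, rng):
    """Build second-pass decoder inputs by mixing gold and predicted frames.

    For each time step, the first-pass prediction replaces the gold frame
    with probability `sample_prob`; otherwise the gold frame is kept.
    """
    if len(gold_frames) != len(predicted_frames):
        raise ValueError("gold and predicted sequences must have equal length")
    mixed = []
    for gold, pred in zip(gold_frames, predicted_frames):
        use_prediction = rng.random() < sample_prob
        mixed.append(pred if use_prediction else gold)
    return mixed


def toy_decoder(inputs):
    """Hypothetical stand-in for one parallel decoder pass over a sequence."""
    return [0.9 * x for x in inputs]


# Scalars stand in for acoustic frames (assumption for illustration only).
gold = [1.0, 2.0, 3.0, 4.0]

# Pass 1: teacher forcing, all positions computed in parallel.
first_pass = toy_decoder(gold)

# Pass 2: inputs are a per-step mix of gold frames and pass-1 predictions.
mixed = mix_decoder_inputs(gold, first_pass, sample_prob=0.5, rng=random.Random(0))
second_pass = toy_decoder(mixed)
```

In a joint-training setup of the kind the abstract describes, the second-pass output would presumably be what feeds the reconstruction loss and the speaker classifier, since it reflects the model's own predictions rather than pure teacher forcing. That wiring is an assumption here, not something the abstract states.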

Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2201.08124 [cs.SD]
	(or arXiv:2201.08124v1 [cs.SD] for this version)

Submission history

From: Jingzhou Yang
[v1] Thu, 20 Jan 2022 12:02:58 GMT (757kb,D)
