Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration

Yeshpanov, Rustem; Mussakhojayeva, Saida; Khassanov, Yerbolat

(Submitted on 25 May 2023)

Abstract: This work aims to build a multilingual text-to-speech (TTS) synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek. We specifically target the zero-shot learning scenario, where a TTS model trained using the data of one language is applied to synthesise speech for other, unseen languages. An end-to-end TTS system based on the Tacotron 2 architecture was trained using only the available data of the Kazakh language. To generate speech for the other Turkic languages, we first mapped the letters of the Turkic alphabets onto the symbols of the International Phonetic Alphabet (IPA), which were then converted to the Kazakh alphabet letters. To demonstrate the feasibility of the proposed approach, we evaluated the multilingual Turkic TTS model subjectively and obtained promising results. To enable replication of the experiments, we make our code and dataset publicly available in our GitHub repository.
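The two-step mapping described in the abstract (source letters to IPA symbols, then IPA symbols to Kazakh letters) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `transliterate` and both tables are hypothetical, and the tables cover only a handful of Turkish letters chosen for the example. The actual mappings are in the authors' repository.

```python
# Hypothetical, deliberately tiny tables for illustration only.
# They are NOT the mapping used in the paper.

# Step 1: letters of a source Turkic alphabet (here: Turkish) -> IPA symbols.
TURKISH_TO_IPA = {
    "ş": "ʃ", "e": "e", "k": "k", "r": "r",
    "g": "ɡ", "ö": "ø", "l": "l", "ü": "y",
}

# Step 2: IPA symbols -> letters of the Kazakh (Cyrillic) alphabet.
IPA_TO_KAZAKH = {
    "ʃ": "ш", "e": "е", "k": "к", "r": "р",
    "ɡ": "г", "ø": "ө", "l": "л", "y": "ү",
}


def transliterate(text, source_to_ipa=TURKISH_TO_IPA, ipa_to_target=IPA_TO_KAZAKH):
    """Map each character to IPA, then to a Kazakh letter.

    Characters missing from either table (spaces, digits, unmapped
    letters) are passed through unchanged.
    """
    out = []
    for ch in text.lower():
        ipa = source_to_ipa.get(ch)
        if ipa is None:
            out.append(ch)
        else:
            out.append(ipa_to_target.get(ipa, ch))
    return "".join(out)


if __name__ == "__main__":
    # The Kazakh-script string would then be fed to the Kazakh-only TTS model.
    print(transliterate("şeker"))
```

Routing through IPA means each additional language needs only one new letter-to-IPA table, while the IPA-to-Kazakh table is shared. A per-character lookup like this one cannot handle digraphs or context-dependent pronunciation, which a fuller mapping would need to address.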

Comments: 5 pages, 1 figure, 3 tables, accepted to Interspeech
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
Cite as: arXiv:2305.15749 [eess.AS] (or arXiv:2305.15749v1 [eess.AS] for this version)

Submission history

From: Yerbolat Khassanov
[v1] Thu, 25 May 2023 05:57:54 GMT (231kb,D)
