Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

Gao, Xiaoxue; Gupta, Chitralekha; Li, Haizhou

(Submitted on 7 Apr 2022)

Abstract: Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the singing style vary across music genres, such as pop, metal, and hip hop, and this variation affects the intelligibility of a song's lyrics in different ways. In this work, we propose to transcribe the lyrics of polyphonic music using a novel genre-conditioned network. The proposed network adopts pre-trained model parameters and incorporates genre adapters between layers to capture the peculiarities of each genre for lyrics-genre pairs, so that only lightweight genre-specific parameters need to be trained. Our experiments show that the proposed genre-conditioned network outperforms existing lyrics transcription systems.

Comments:	5 pages, 1 figure, accepted by IEEE ICASSP 2022
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2204.03307 [cs.SD]
	(or arXiv:2204.03307v1 [cs.SD] for this version)

Submission history

From: Xiaoxue Gao
[v1] Thu, 7 Apr 2022 09:15:46 GMT (132kb,D)
