Music-robust Automatic Lyrics Transcription of Polyphonic Music

Gao, Xiaoxue; Gupta, Chitralekha; Li, Haizhou

(Submitted on 7 Apr 2022 (v1), last revised 22 Apr 2022 (this version, v2))

Abstract: Lyrics transcription of polyphonic music is challenging because the singing vocals are corrupted by the background music. To make lyrics transcription more robust to background music, we propose combining two kinds of features: music-removed features, which are computed from the extracted singing vocals and so emphasize the vocals, and music-present features, which capture the singing vocals together with the background music. We show that the two feature sets complement each other and that their combination outperforms either set used alone, improving the robustness of the acoustic model to background music. Furthermore, interpolating a general-purpose language model with an in-domain, lyrics-specific language model further improves transcription results. Our experiments show that the proposed strategy outperforms existing lyrics transcription systems for polyphonic music. Moreover, we find that the proposed music-robust features especially improve lyrics transcription for songs in the metal genre, where the background music is loud and dominant.
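
The language model interpolation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which interpolation scheme is used, so this assumes plain linear interpolation, a common choice. The function names, the weight `lam`, and the toy unigram probabilities are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def interpolate(p_general, p_lyrics, lam):
    """Linearly mix two word distributions (assumed scheme, not from the paper).

    lam is the weight on the in-domain lyrics model; (1 - lam) goes to the
    general-purpose model. Words missing from one model get probability 0
    from that model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    vocab = set(p_general) | set(p_lyrics)
    return {
        w: lam * p_lyrics.get(w, 0.0) + (1.0 - lam) * p_general.get(w, 0.0)
        for w in vocab
    }


def perplexity(model, words):
    """Unigram perplexity of a word sequence under a distribution."""
    log_sum = sum(math.log(model[w]) for w in words)
    return math.exp(-log_sum / len(words))


# Toy, made-up unigram distributions (each sums to 1).
P_GENERAL = {"the": 0.5, "love": 0.1, "report": 0.4}
P_LYRICS = {"the": 0.3, "love": 0.6, "baby": 0.1}

mixed = interpolate(P_GENERAL, P_LYRICS, lam=0.5)
print(sorted(mixed.items()))
print(perplexity(mixed, ["love", "baby", "the"]))
```

In practice the weight is usually tuned on held-out in-domain text, for example by picking the `lam` that minimizes perplexity on a development set of lyrics. The mixed model also covers words that only one of the two models knows, such as "baby" and "report" in this toy vocabulary.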

Comments:	7 pages, 2 figures, accepted by 2022 Sound and Music Computing
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2204.03306 [eess.AS]
	(or arXiv:2204.03306v2 [eess.AS] for this version)

Submission history

From: Xiaoxue Gao
[v1] Thu, 7 Apr 2022 09:14:58 GMT (118kb,D)
[v2] Fri, 22 Apr 2022 12:06:57 GMT (118kb,D)
