Breaking Speech Recognizers to Imagine Lyrics

Gillick, Jon; Bamman, David

(Submitted on 15 Dec 2019)

Abstract: We introduce a new method for generating text, and in particular song lyrics, based on the speech-like acoustic qualities of a given audio file. We repurpose a vocal source separation algorithm and an acoustic model trained to recognize isolated speech, instead inputting instrumental music or environmental sounds. Feeding the "mistakes" of the vocal separator into the recognizer, we obtain a transcription of words imagined to be spoken in the input audio. We describe the key components of our approach, present initial analysis, and discuss the potential of the method for machine-in-the-loop collaboration in creative applications.
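
The data flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: `imagine_lyrics`, `toy_separator`, and `toy_recognizer` are hypothetical, and the two toy functions are trivial stand-ins, not the separation algorithm or acoustic model the authors used. The sketch only shows the chaining of a separator's output into a recognizer.

```python
from typing import Callable, List, Sequence

Audio = Sequence[float]  # mono samples; a stand-in for a real audio buffer


def imagine_lyrics(
    audio: Audio,
    separate_vocals: Callable[[Audio], Audio],
    recognize_speech: Callable[[Audio], List[str]],
) -> List[str]:
    """Chain a vocal separator into a speech recognizer (hypothetical interfaces)."""
    # On input with no singing, whatever the separator returns is its "mistake".
    pseudo_vocals = separate_vocals(audio)
    # The recognizer is applied to that residue and returns the words it "hears".
    return recognize_speech(pseudo_vocals)


# Toy stand-ins so the sketch runs; neither is a real model.
def toy_separator(audio: Audio) -> Audio:
    # Keep only the louder samples, pretending they are vocal-like.
    return [x for x in audio if abs(x) > 0.5]


def toy_recognizer(audio: Audio) -> List[str]:
    # Emit one placeholder syllable per retained sample.
    return ["la"] * len(audio)


words = imagine_lyrics([0.1, 0.9, -0.7, 0.2], toy_separator, toy_recognizer)
print(words)
```

In practice the two callables would wrap a trained source separation model and a trained speech recognizer; passing them in as parameters keeps the sketch independent of any particular library.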

Comments:	3 pages
Subjects:	Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Machine Learning (cs.LG)
Journal reference:	NeurIPS 2019 Workshop on Machine Learning for Creativity and Design
Cite as:	arXiv:1912.06979 [cs.HC]
	(or arXiv:1912.06979v1 [cs.HC] for this version)

Submission history

From: Jon Gillick
[v1] Sun, 15 Dec 2019 05:34:45 GMT (99kb,D)
