We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

eess.AS

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Electrical Engineering and Systems Science > Audio and Speech Processing

Title: Analysis and transformations of intensity in singing voice

Abstract: In this paper we introduce a neural auto-encoder that transforms the voice intensity in recordings of singing voice. Since most recordings of singing voice are not annotated with voice intensity we propose a means to estimate the relative voice intensity from the signal's timbre using a neural intensity estimator. Two methods to overcome the unknown recording factor that relates voice intensity to recorded signal power are given: The unknown recording factor can either be learned alongside the weights of the intensity estimator, or a special loss function based on the scalar product can be used to only match the intensity contour of the recorded signal's power. The intensity models are used to condition a previously introduced bottleneck auto-encoder that disentangles its input, the mel-spectrogram, from the intensity. We evaluate the intensity models by their consistency and by their fitness to provide useful information to the auto-encoder. A perceptive test is carried out that evaluates the perceived intensity change in transformed recordings and the synthesis quality. The perceptive test confirms that changing the conditional input changes the perceived intensity accordingly thus suggesting that the proposed intensity models encode information about the voice intensity.
Comments: Submitted to Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2204.04006 [eess.AS]
  (or arXiv:2204.04006v1 [eess.AS] for this version)

Submission history

From: Frederik Bous [view email]
[v1] Fri, 8 Apr 2022 11:34:33 GMT (434kb)
[v2] Tue, 22 Nov 2022 14:41:53 GMT (56kb,D)

Link back to: arXiv, form interface, contact.