Supervised Initialization of LSTM Networks for Fundamental Frequency Detection in Noisy Speech Signals

Coto-Jimenez, Marvin

Computer Science > Sound

Title: Supervised Initialization of LSTM Networks for Fundamental Frequency Detection in Noisy Speech Signals

Authors: Marvin Coto-Jimenez

(Submitted on 11 Nov 2019)

Abstract: Fundamental frequency is one of the most important parameters of human speech, relevant to speaker identification and to the classification of accent, gender, age, and speaking style, among others. The proper detection of this parameter remains an important challenge for severely degraded signals. In previous work on detecting fundamental frequency in noisy speech using deep learning, networks such as Long Short-Term Memory (LSTM) have been initialized with random weights and then trained with a back-propagation through time algorithm. In this work, a more efficient initialization is proposed, based on supervised training using an auto-associative network. This initialization provides a better starting point for the detection of fundamental frequency in noisy speech. Its advantages are noticeable in objective measures of detection accuracy and of network training, in the presence of additive white noise at different signal-to-noise ratios.
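
The abstract does not say how the noisy signals were produced, so the following is only a minimal sketch of one common way to add white noise at a chosen signal-to-noise ratio. The function names (`add_white_noise`, `measured_snr_db`), the 16 kHz sampling rate, the 120 Hz test tone, and the 10 dB level are all illustrative assumptions, not details taken from the paper.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return the signal plus white Gaussian noise scaled to the requested SNR in dB.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # SNR_dB = 10 * log10(signal_power / noise_power), solved for the noise power.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of a noisy signal against its clean reference."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Assumed test signal: one second of a 120 Hz tone sampled at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 120 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=10.0)
```

Because the noise is rescaled using its measured power, `measured_snr_db(clean, noisy)` matches the requested level up to floating-point error. Repeating the call with different `snr_db` values would give a set of progressively degraded signals.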

Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1911.04580 [cs.SD]
	(or arXiv:1911.04580v1 [cs.SD] for this version)

Submission history

From: Marvin Coto Mr.
[v1] Mon, 11 Nov 2019 21:57:29 GMT (305kb)
