Electrical Engineering and Systems Science > Audio and Speech Processing
Title: A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications
(Submitted on 30 Apr 2020 (v1), revised 18 Jun 2020 (this version, v2), latest version 4 Dec 2020 (v4))
Abstract: Auditory models are commonly used as feature extractors for automatic speech recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models have progressed over the years to capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are slow to compute and are consequently not used in real-time applications. To enable their uptake, we present a hybrid approach in which convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model of human cochlear mechanics and level-dependent cochlear filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material, but its performance and applicability were evaluated using (unseen) sound stimuli common in cochlear mechanics research. The CoNNear model accurately simulates human frequency selectivity and its dependence on sound intensity, which is essential for the robust speech intelligibility that is a hallmark of human hearing, even at negative speech-to-background-noise ratios. Because its architecture is based on real-time, parallel and differentiable computations, the CoNNear model can bring real-time auditory applications closer to human performance and can inspire the next generation of speech recognition, robotics and hearing-aid systems.
Submission history
From: Sarah Verhulst
[v1] Thu, 30 Apr 2020 14:43:03 GMT (7227kb,D)
[v2] Thu, 18 Jun 2020 20:38:38 GMT (8480kb,D)
[v3] Thu, 1 Oct 2020 12:05:56 GMT (8696kb,D)
[v4] Fri, 4 Dec 2020 20:08:14 GMT (11097kb,D)