BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Rieger, Will

Electrical Engineering and Systems Science > Audio and Speech Processing

Title: BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Authors: Will Rieger

(Submitted on 16 Jan 2023)

Abstract: Recent End-to-End Deep Learning models have been shown to perform near or better than state-of-the-art Recurrent Neural Networks (RNNs) on Automatic Speech Recognition tasks. These models tend to be lighter weight and require less training time than traditional RNN-based approaches. However, these models take a frequentist approach to weight training. In theory, network weights are drawn from a latent, intractable probability distribution. We introduce BayesSpeech for end-to-end Automatic Speech Recognition. BayesSpeech is a Bayesian Transformer Network, built without recurrence, in which these intractable posteriors are learned through variational inference and the local reparameterization trick. We show how the introduction of variance in the weights leads to faster training time and near state-of-the-art performance on LibriSpeech-960.
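
As an illustration of the local reparameterization trick named in the abstract, the following is a minimal NumPy sketch for a single linear layer. It is not the paper's implementation: the function name `local_reparam_linear`, the parameter names, and the assumption of a fully factorized Gaussian posterior over the weights are all hypothetical choices made for this example. Instead of sampling a weight matrix, the sketch samples the layer's pre-activations directly from the Gaussian they would follow.

```python
import numpy as np

def local_reparam_linear(x, w_mu, w_log_var, rng):
    """Sample pre-activations of a Bayesian linear layer (illustrative only).

    Assumption: a fully factorized Gaussian posterior over the weights,
    W[i, j] ~ N(w_mu[i, j], exp(w_log_var[i, j])).
    """
    act_mu = x @ w_mu                         # mean of x @ W
    act_var = (x ** 2) @ np.exp(w_log_var)    # variance of x @ W
    eps = rng.standard_normal(act_mu.shape)   # one noise draw per activation
    return act_mu + np.sqrt(act_var) * eps

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))               # batch of 4 inputs, 8 features
w_mu = 0.1 * rng.standard_normal((8, 3))      # posterior means
w_log_var = np.full((8, 3), -6.0)             # posterior log-variances
out = local_reparam_linear(x, w_mu, w_log_var, rng)
print(out.shape)
```

Because the noise is drawn per activation rather than per weight, each example in the batch receives independent noise, which is the usual motivation for the trick: lower-variance gradient estimates at a cost similar to a deterministic layer.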

Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2301.11276 [eess.AS]
	(or arXiv:2301.11276v1 [eess.AS] for this version)

Submission history

From: William Rieger [view email]
[v1] Mon, 16 Jan 2023 16:19:04 GMT (651kb,D)
