
Title: Gated Recurrent Context: Softmax-free Attention for Online Encoder-Decoder Speech Recognition

Abstract: Recently, attention-based encoder-decoder (AED) models have shown state-of-the-art performance in automatic speech recognition (ASR). Because the original AED models with global attention cannot perform online inference, various online attention schemes have been developed to reduce ASR latency for a better user experience. However, a common limitation of conventional softmax-based online attention approaches is that they introduce an additional hyperparameter related to the length of the attention window, which requires multiple training runs to tune. To address this problem, we propose a novel softmax-free attention method and its modified formulation for online attention, which does not need any additional hyperparameter in the training phase. Through a number of ASR experiments, we demonstrate that the tradeoff between the latency and performance of the proposed online attention technique can be controlled merely by adjusting a threshold in the test phase. Furthermore, the proposed methods performed competitively with conventional global and online attention in terms of word error rate (WER).
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
DOI: 10.1109/TASLP.2021.3049344
Cite as: arXiv:2007.05214 [eess.AS]
  (or arXiv:2007.05214v3 [eess.AS] for this version)

Submission history

From: Hyeonseung Lee
[v1] Fri, 10 Jul 2020 07:35:31 GMT (370kb,D)
[v2] Thu, 23 Jul 2020 09:59:44 GMT (370kb,D)
[v3] Thu, 14 Jan 2021 09:10:27 GMT (403kb,D)
