Title: CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

Authors: Linhao Dong, Bo Xu
Abstract: In this paper, we propose a novel soft and monotonic alignment mechanism for sequence transduction. It is inspired by the integrate-and-fire model in spiking neural networks and is employed in the encoder-decoder framework. Because it consists of continuous functions, it is named Continuous Integrate-and-Fire (CIF). Applied to the ASR task, CIF not only offers a concise calculation but also supports online recognition and acoustic boundary positioning, making it suitable for various ASR scenarios. Several support strategies are also proposed to alleviate problems unique to the CIF-based model. With these methods combined, the CIF-based model shows competitive performance. Notably, it achieves a word error rate (WER) of 2.86% on the test-clean set of Librispeech and sets a new state-of-the-art result on a Mandarin telephone ASR benchmark.
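
The integrate-and-fire idea named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `cif`, the threshold of 1.0, the way a weight is split at a firing boundary, and the assumption that each weight lies in [0, 1] are all choices made here for illustration. Per-frame weights are accumulated left to right; when the running total reaches the threshold, the weighted sum of encoder states collected so far is emitted ("fired"), and the leftover weight starts the next accumulation.

```python
def cif(states, weights, threshold=1.0):
    """Illustrative integrate-and-fire over a sequence (hypothetical helper).

    states:  list of encoder state vectors (lists of floats), one per frame
    weights: list of per-frame weights, each assumed to lie in [0, 1]
    Returns (fired, boundaries): the integrated vectors that were emitted and
    the frame index at which each one fired.
    """
    fired, boundaries = [], []
    dim = len(states[0]) if states else 0
    acc = 0.0                     # weight accumulated since the last firing
    integrated = [0.0] * dim      # weighted sum of states since the last firing

    for step, (h, alpha) in enumerate(zip(states, weights)):
        if acc + alpha < threshold:
            # Below the threshold: keep integrating.
            acc += alpha
            integrated = [s + alpha * x for s, x in zip(integrated, h)]
        else:
            # Threshold reached: use just enough weight to complete this unit.
            used = threshold - acc
            fired.append([s + used * x for s, x in zip(integrated, h)])
            boundaries.append(step)
            # The remaining weight of this frame seeds the next unit.
            rest = alpha - used
            acc = rest
            integrated = [rest * x for x in h]

    return fired, boundaries


if __name__ == "__main__":
    outputs, bounds = cif([[1.0], [3.0], [2.0], [4.0]], [0.5, 0.5, 0.25, 0.75])
    print(outputs, bounds)  # [[2.0], [3.5]] [1, 3]
```

Because the loop only ever looks at the current frame and the running totals, a sketch like this can emit outputs as frames arrive, and the recorded firing indices mark where each output unit ended in the input.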
Comments: To appear at ICASSP 2020
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:1905.11235 [cs.CL]
  (or arXiv:1905.11235v4 [cs.CL] for this version)

Submission history

From: Linhao Dong
[v1] Mon, 27 May 2019 14:00:45 GMT (1718kb,D)
[v2] Wed, 7 Aug 2019 15:33:54 GMT (1127kb,D)
[v3] Sun, 10 Nov 2019 04:47:02 GMT (425kb,D)
[v4] Wed, 12 Feb 2020 11:13:58 GMT (425kb,D)