Title: Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation

Abstract: Simultaneous speech translation (SimulST) is a challenging task that aims to translate streaming speech before the complete input is observed. A SimulST system generally includes two components: the pre-decision, which aggregates the speech information, and the policy, which decides whether to read or write. While recent works have proposed various strategies to improve the pre-decision, they mainly adopt the fixed wait-k policy, leaving adaptive policies largely unexplored. This paper proposes to model the adaptive policy by adapting Continuous Integrate-and-Fire (CIF). Compared with monotonic multihead attention (MMA), our method has the advantages of simpler computation, superior quality at low latency, and better generalization to long utterances. We conduct experiments on the MuST-C V2 dataset and show the effectiveness of our approach.
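
For readers unfamiliar with the fixed wait-k baseline mentioned in the abstract, the following is a minimal sketch of its read/write schedule. It is not the paper's method. It assumes the usual wait-k formulation, in which the policy reads k source units and then alternates one write with one read until the source is exhausted. The function name `wait_k_actions` and its parameters are hypothetical, and a "source unit" here stands for whatever segment the pre-decision module emits.

```python
from typing import List


def wait_k_actions(source_len: int, target_len: int, k: int) -> List[str]:
    """Return the READ/WRITE schedule of a fixed wait-k policy (illustrative only).

    source_len: number of source units produced by the pre-decision (assumed known here)
    target_len: number of target tokens to emit
    k: number of source units to read before the first write
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    actions: List[str] = []
    read = 0      # source units consumed so far
    written = 0   # target tokens emitted so far

    while written < target_len:
        # Before emitting the next target token, wait-k requires
        # k + written source units, capped at the full source length.
        needed = min(k + written, source_len)
        if read < needed:
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


if __name__ == "__main__":
    print(wait_k_actions(source_len=4, target_len=3, k=2))
```

The schedule depends only on k and the lengths, never on the content of the speech, which is what makes it a fixed policy. An adaptive policy, such as the CIF-based one the abstract proposes, instead decides when to read or write from the input itself.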
Comments: INTERSPEECH 2022 camera ready
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Journal reference: Proc. Interspeech 2022, 5175-5179
DOI: 10.21437/Interspeech.2022-10627
Cite as: arXiv:2204.09595 [cs.CL]
  (or arXiv:2204.09595v3 [cs.CL] for this version)

Submission history

From: Chih-Chiang Chang
[v1] Tue, 22 Mar 2022 23:33:18 GMT (186kb,D)
[v2] Thu, 21 Apr 2022 03:54:21 GMT (187kb,D)
[v3] Mon, 3 Oct 2022 23:57:05 GMT (204kb,D)