Title: Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes

Authors: Kevin Glocker (1), Aaricia Herygers (1), Munir Georges (1 and 2) ((1) AImotion Bavaria Technische Hochschule Ingolstadt, (2) Intel Labs Germany)
Abstract: This paper proposes Allophant, a multilingual phoneme recognizer. It requires only a phoneme inventory for cross-lingual transfer to a target language, allowing for low-resource recognition. The architecture combines a compositional phone embedding approach with individually supervised phonetic attribute classifiers in a multi-task architecture. We also introduce Allophoible, an extension of the PHOIBLE database. When combined with a distance-based mapping approach for grapheme-to-phoneme outputs, it allows us to train on PHOIBLE inventories directly. By training and evaluating on 34 languages, we found that the addition of multi-task learning improves the model's capability of being applied to unseen phonemes and phoneme inventories. On supervised languages we achieve phoneme error rate (PER) improvements of 11 percentage points (pp.) compared to a baseline without multi-task learning. Evaluation of zero-shot transfer on 84 languages yielded a decrease in PER of 2.63 pp. over the baseline.
Comments: 5 pages, 2 figures, 2 tables, accepted to INTERSPEECH 2023; published version
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
ACM classes: I.2.7
Journal reference: Proc. INTERSPEECH 2023, 2258-2262
DOI: 10.21437/Interspeech.2023-772
Cite as: arXiv:2306.04306 [cs.CL]
  (or arXiv:2306.04306v2 [cs.CL] for this version)
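
The abstract reports results as phoneme error rate (PER) differences in percentage points. As a minimal sketch, assuming the common definition of PER as the Levenshtein distance between the reference and predicted phoneme sequences divided by the reference length, the metric can be computed as below. The function names and the toy sequences are illustrative assumptions and are not taken from the paper or its code.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # phoneme in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER in percent: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Toy example (made-up sequences, not data from the paper).
reference = ["a", "b", "c", "d"]
baseline_output = ["a", "c", "e"]   # one deletion, one substitution
improved_output = ["a", "c", "d"]   # one deletion

baseline_per = phoneme_error_rate(reference, baseline_output)
improved_per = phoneme_error_rate(reference, improved_output)
# A difference in percentage points is the plain difference of the two rates.
improvement_pp = baseline_per - improved_per
```

Under this definition, an improvement stated in pp. is an absolute difference between two error rates, not a relative reduction.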

Submission history

From: Kevin Glocker
[v1] Wed, 7 Jun 2023 10:11:09 GMT (55kb,D)
[v2] Wed, 16 Aug 2023 17:44:59 GMT (56kb,D)
