Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models

Wiesner, Matthew; Raj, Desh; Khudanpur, Sanjeev

(Submitted on 10 Oct 2021)

Abstract: Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models. We demonstrate how universal phoneset acoustic models can leverage cross-lingual supervision to improve transfer of pretrained self-supervised representations to new languages. We also show how target-language text can be used to enable and improve fine-tuning with the lattice-free maximum mutual information (LF-MMI) objective. In three low-resource languages, these techniques greatly improved few-shot learning performance.

Comments:	© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
Cite as:	arXiv:2110.04863 [eess.AS]
	(or arXiv:2110.04863v1 [eess.AS] for this version)

Submission history

From: Matthew Wiesner
[v1] Sun, 10 Oct 2021 17:33:44 GMT (163kb,D)
