Low Resource Pipeline for Spoken Language Understanding via Weak Supervision

Kumar, Ayush; Tripathi, Rishabh Kumar; Vepa, Jithendra

(Submitted on 21 Jun 2022)

Abstract: In Weak Supervised Learning (WSL), a model is trained on noisy labels obtained from semantic rules and task-specific pre-trained models. Rules offer limited generalization across tasks and require significant manual effort, while pre-trained models are available for only a limited set of tasks. In this work, we propose using prompt-based methods as weak sources to obtain noisy labels on unannotated data. We show that task-agnostic prompts are generalizable and can be used to obtain noisy labels for different Spoken Language Understanding (SLU) tasks such as sentiment classification, disfluency detection, and emotion classification. These prompts can additionally be updated with task-specific context, which gives flexibility in designing task-specific prompts. We demonstrate that prompt-based methods generate reliable labels for the above SLU tasks and can therefore be used as a universal weak source to train a weak-supervised model (WSM) in the absence of labeled data. Our proposed WSL pipeline, trained on the prompt-based weak source, outperforms other competitive low-resource benchmarks in zero- and few-shot learning by more than 4% Macro-F1 on all three benchmark SLU datasets. The proposed method also outperforms a conventional rule-based WSL pipeline by more than 5% Macro-F1.
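
The idea of using several prompts as weak sources and combining their noisy votes can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt templates, the verbalizer word lists, the abstain rule, and the majority-vote aggregation are all assumptions made for illustration, and `toy_word_score` is a hypothetical keyword counter standing in for a pre-trained language model so that the example runs without any model download.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

ABSTAIN = None  # a weak source may decline to label an utterance

# Hypothetical prompts: each pairs a template with a verbalizer that maps
# a label to candidate answer words. The abstract does not list the real ones.
PROMPTS: List[Tuple[str, Dict[str, List[str]]]] = [
    ("{text} Overall, it was [MASK].",
     {"positive": ["great", "good"], "negative": ["bad", "terrible"]}),
    ("{text} The speaker sounds [MASK].",
     {"positive": ["happy", "pleased"], "negative": ["angry", "upset"]}),
    ("{text} I would call this [MASK].",
     {"positive": ["helpful", "good"], "negative": ["useless", "bad"]}),
]

def toy_word_score(filled_prompt: str, word: str) -> float:
    """Assumed stand-in for a language model's score of `word` at [MASK].

    It only counts how often `word` occurs in the filled prompt, so it is
    closer to a keyword rule than to a real model.
    """
    tokens = re.findall(r"[a-z]+", filled_prompt.lower())
    return float(tokens.count(word))

def weak_label(text: str, template: str, verbalizer: Dict[str, List[str]],
               score_fn: Callable[[str, str], float]) -> Optional[str]:
    """Return one noisy label from a single prompt, or ABSTAIN on no clear winner."""
    filled = template.format(text=text)
    totals = {label: sum(score_fn(filled, w) for w in words)
              for label, words in verbalizer.items()}
    ranked = sorted(totals.values(), reverse=True)
    if ranked[0] == 0 or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return ABSTAIN
    return max(totals, key=totals.get)

def aggregate(votes: List[Optional[str]]) -> Optional[str]:
    """Majority vote over non-abstaining prompts; ties and empty votes abstain."""
    counts = Counter(v for v in votes if v is not ABSTAIN)
    top = counts.most_common(2)
    if not top or (len(top) > 1 and top[0][1] == top[1][1]):
        return ABSTAIN
    return top[0][0]

def label_corpus(texts: List[str],
                 score_fn: Callable[[str, str], float] = toy_word_score
                 ) -> List[Optional[str]]:
    """Assign one aggregated noisy label (or ABSTAIN) to each unannotated utterance."""
    return [aggregate([weak_label(t, tpl, verb, score_fn) for tpl, verb in PROMPTS])
            for t in texts]

utterances = [
    "The agent was great and I am happy with the good service.",
    "This was a terrible call and I am upset.",
    "Please hold the line.",
]
noisy_labels = label_corpus(utterances)
```

In a pipeline of this shape, the utterances that receive a non-abstain label would then serve as noisy training data for a downstream classifier, and `score_fn` would be replaced by scores from an actual pre-trained model.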

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2206.10559 [cs.CL]
	(or arXiv:2206.10559v1 [cs.CL] for this version)

Submission history

From: Ayush Kumar
[v1] Tue, 21 Jun 2022 17:36:31 GMT (721kb,D)
