What Makes Pre-trained Language Models Better Zero-shot Learners?

Lu, Jinghui; Zhu, Dongsheng; Han, Weidong; Zhao, Rui; Mac Namee, Brian; Tan, Fei

(Submitted on 30 Sep 2022 (v1), last revised 16 May 2023 (this version, v3))

Abstract: Current methods for prompt learning in zero-shot scenarios widely rely on a development set with sufficient human-annotated data to select the best-performing prompt template a posteriori. This is not ideal because in a real-world zero-shot scenario of practical relevance, no labelled data is available. Thus, we propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection). We hypothesize that language discrepancy can be used to measure the efficacy of prompt templates, and thereby develop a substantiated perplexity-based scheme allowing for forecasting the performance of prompt templates in advance. Experiments show that our method leads to improved prediction performance in a realistic zero-shot setting, eliminating the need for any labelled examples.
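
The abstract does not spell out how templates are scored, so the following is only a minimal sketch of perplexity-based template screening under stated assumptions: each candidate template has already been filled with a few unlabelled inputs and scored by a language model, yielding per-token natural-log probabilities, and templates with lower mean perplexity are preferred. The names `perplexity`, `mean_perplexity`, and `rank_templates` and the input layout are hypothetical, not the authors' implementation.

```python
import math
from typing import Dict, List


def perplexity(token_log_probs: List[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def mean_perplexity(filled_prompts: List[List[float]]) -> float:
    """Average perplexity of one template over several unlabelled inputs."""
    if not filled_prompts:
        raise ValueError("need at least one filled prompt")
    return sum(perplexity(lp) for lp in filled_prompts) / len(filled_prompts)


def rank_templates(scores: Dict[str, List[List[float]]]) -> List[str]:
    """Order template names from lowest to highest mean perplexity.

    Assumption: lower perplexity signals a template closer to the
    model's pre-training language, hence a better candidate.
    """
    return sorted(scores, key=lambda name: mean_perplexity(scores[name]))


# Made-up log-probabilities for two hypothetical templates; no labels are used.
example_scores = {
    "template_a": [[math.log(0.5), math.log(0.5)], [math.log(0.5)]],
    "template_b": [[math.log(0.25), math.log(0.25)], [math.log(0.25)]],
}
ranking = rank_templates(example_scores)
```

In practice the log-probabilities would come from the same pre-trained language model that later performs the zero-shot classification; only unlabelled text is needed to produce the ranking.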

Comments: Accepted to ACL 2023 main conference
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2209.15206 [cs.CL] (or arXiv:2209.15206v3 [cs.CL] for this version)

Submission history

From: Jinghui Lu
[v1] Fri, 30 Sep 2022 03:28:19 GMT (10831kb,D)
[v2] Sun, 14 May 2023 08:19:28 GMT (8244kb,D)
[v3] Tue, 16 May 2023 02:45:51 GMT (8244kb,D)
