Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss

Wang, Junjie; Zhang, Yuxiang; Yang, Ping; Gan, Ruyi

(Submitted on 5 Aug 2022)

Abstract: This report describes Erlangshen, a pre-trained language model with propensity-corrected loss that ranked No. 1 in the CLUE Semantic Matching Challenge. In the pre-training stage, we construct a knowledge-based dynamic masking strategy for Masked Language Modeling (MLM) with whole word masking. Furthermore, based on the specific structure of the dataset, the pre-trained Erlangshen applies a propensity-corrected loss (PCL) in the fine-tuning phase. Overall, we achieve an F1 score of 72.54 and an accuracy of 78.90 on the test set. Our code is publicly available at: this https URL
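
The abstract names dynamic masking combined with whole word masking but does not spell out the mechanics. The following is a minimal, generic sketch of that combination, not the authors' knowledge-based strategy: it assumes WordPiece-style tokens where a `##` prefix marks a continuation piece (Chinese text would normally rely on a word segmenter instead), and the names `group_whole_words`, `whole_word_mask`, and `mask_prob` are hypothetical. Masking is "dynamic" here only in the sense that a fresh mask is sampled on every call, for example once per epoch.

```python
import random

MASK = "[MASK]"

def group_whole_words(tokens):
    """Group token indices into words; a '##' prefix marks a continuation piece (assumption)."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)   # continuation: attach to the current word
        else:
            groups.append([i])     # start of a new word
    return groups

def whole_word_mask(tokens, mask_prob=0.15, rng=None):
    """Mask whole words at random.

    Returns (masked_tokens, labels), where labels holds the original token
    at each masked position and None elsewhere.
    """
    rng = rng or random
    groups = group_whole_words(tokens)
    # Mask at least one word when possible, never more than are available.
    n_to_mask = min(len(groups), max(1, round(len(groups) * mask_prob)))
    chosen = rng.sample(groups, n_to_mask)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for group in chosen:
        for i in group:            # every piece of a chosen word is masked together
            labels[i] = masked[i]
            masked[i] = MASK
    return masked, labels

# Illustrative tokens only; not taken from the paper's data.
tokens = ["the", "erl", "##ang", "##shen", "model", "matches", "sentence", "pairs"]
rng = random.Random(0)
for epoch in range(3):
    # Dynamic masking: a new mask is drawn each epoch instead of being fixed once.
    masked, _ = whole_word_mask(tokens, mask_prob=0.3, rng=rng)
    print(epoch, masked)
```

Because the mask is resampled on each call, the model sees different masked positions for the same sentence across epochs, while the grouping step ensures a multi-piece word is never only partially hidden.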

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2208.02959 [cs.CL]
	(or arXiv:2208.02959v1 [cs.CL] for this version)

Submission history

From: Junjie Wang
[v1] Fri, 5 Aug 2022 02:52:29 GMT (6728kb,D)
