Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation

Zhong, Qiwei; Zeng, Guanxiong; Zhu, Danqing; Zhang, Yang; Lin, Wangli; Chen, Ben; Tang, Jiayu

(Submitted on 1 Jul 2021)

Abstract: An obstacle to scientific document understanding is the extensive use of acronyms, which are shortened forms of long technical phrases. Acronym disambiguation aims to find the correct meaning of an ambiguous acronym in a given text. Recent efforts have incorporated word embeddings and deep learning architectures and achieved significant improvements on this task. In general domains, many fine-grained pretrained language models have emerged, thanks to large-scale corpora that can usually be obtained through crowdsourcing. However, these models, which are based on domain-agnostic knowledge, may perform insufficiently when applied directly to the scientific domain. Moreover, obtaining large-scale, high-quality annotated data and representing high-level semantics in the scientific domain is challenging and expensive. In this paper, we consider both domain-agnostic and domain-specific knowledge and propose a Hierarchical Dual-path BERT method, coined hdBERT, to capture general fine-grained and high-level specific representations for acronym disambiguation. First, the context-based pretrained models RoBERTa and SciBERT are used to encode these two kinds of knowledge, respectively. Second, a multilayer perceptron is devised to integrate the dual-path representations simultaneously and output the prediction. Using the widely adopted SciAD dataset, which contains 62,441 sentences, we investigate the effectiveness of hdBERT. The experimental results show that the proposed approach outperforms state-of-the-art methods across various evaluation metrics. Specifically, its macro F1 reaches 93.73%.
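
As a rough illustration of the dual-path idea in the abstract, the sketch below concatenates two sentence representations and passes them through a small multilayer perceptron that scores candidate expansions of an acronym. It is not the authors' implementation. The two input vectors are random stand-ins for what RoBERTa and SciBERT encoders would produce, and the layer sizes, untrained random weights, and all function names are assumptions made for this example only.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row, computed as row . x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Hypothetical initializer: small uniform weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(general_vec, scientific_vec, params):
    """Concatenate the two path representations and score candidates."""
    fused = general_vec + scientific_vec  # list concatenation
    hidden = relu(linear(fused, *params["hidden"]))
    return softmax(linear(hidden, *params["output"]))


rng = random.Random(0)

# Stand-ins for encoder outputs (assumed 8-dimensional for brevity).
general_vec = [rng.gauss(0, 1) for _ in range(8)]     # domain-agnostic path
scientific_vec = [rng.gauss(0, 1) for _ in range(8)]  # domain-specific path

# 16 fused inputs -> 6 hidden units -> 3 candidate expansions (all assumed).
params = {
    "hidden": init_layer(rng, 16, 6),
    "output": init_layer(rng, 6, 3),
}

probs = fuse_and_classify(general_vec, scientific_vec, params)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

In a real system the two vectors would come from the pretrained encoders and the perceptron weights would be learned from labeled data such as SciAD. Here they are random, so the predicted index carries no meaning beyond showing the data flow.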

Comments:	Second Place Solution, Accepted to SDU@AAAI-21
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2107.00316 [cs.AI]
	(or arXiv:2107.00316v1 [cs.AI] for this version)

Submission history

From: Danqing Zhu
[v1] Thu, 1 Jul 2021 09:10:00 GMT (379kb,D)
