Title: Improving Tagging Consistency and Entity Coverage for Chemical Identification in Full-text Articles

Abstract: This paper is a technical report on our system submitted to the chemical identification task of the BioCreative VII Track 2 challenge. The main feature of this challenge is that the data consists of full-text articles, whereas existing datasets usually consist only of titles and abstracts. To address this setting effectively, we aim to improve tagging consistency and entity coverage using methods such as majority voting within the same article for named entity recognition (NER) and a hybrid approach that combines a dictionary and a neural model for normalization. In experiments on the NLM-Chem dataset, we show that our methods improve the models' performance, particularly in terms of recall. Finally, in the official evaluation of the challenge, our system ranked 1st in NER, significantly outperforming the baseline model and more than 80 submissions from 16 teams.
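
To make the idea of majority voting within an article concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the system described in the report: the function name `majority_vote_labels`, the `(surface_text, label)` input format, and the case-insensitive matching of surface forms are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def majority_vote_labels(mentions):
    """Relabel each mention with the most frequent label of its surface form.

    `mentions` is a list of (surface_text, predicted_label) pairs taken from
    ONE article. Surface forms are compared case-insensitively, which is an
    assumption of this sketch rather than a detail taken from the report.
    """
    # Count how often each label was predicted for each surface form.
    counts = defaultdict(Counter)
    for text, label in mentions:
        counts[text.lower()][label] += 1

    # Pick the most frequent label per surface form.
    majority = {
        text: counter.most_common(1)[0][0] for text, counter in counts.items()
    }

    # Apply the winning label to every occurrence in the article.
    return [(text, majority[text.lower()]) for text, _ in mentions]


if __name__ == "__main__":
    article_mentions = [
        ("aspirin", "Chemical"),
        ("Aspirin", "O"),  # inconsistent prediction for the same string
        ("aspirin", "Chemical"),
        ("water", "O"),
    ]
    for text, label in majority_vote_labels(article_mentions):
        print(text, label)
```

In this toy input, the one occurrence of "Aspirin" that was tagged `O` is relabeled to agree with the other two occurrences. This is how such a vote could raise recall on long full-text articles, where the same chemical name appears many times.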
Comments: BioCreative VII Challenge Evaluation Workshop
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as: arXiv:2111.10584 [cs.CL]
  (or arXiv:2111.10584v1 [cs.CL] for this version)

Submission history

From: Hyunjae Kim
[v1] Sat, 20 Nov 2021 13:13:58 GMT (405kb)
