Computer Science > Computation and Language
Title: Atypical lexical abbreviations identification in Russian medical texts
(Submitted on 4 Jun 2022)
Abstract: Abbreviation is a method of word formation that constructs a shortened term from the first letters of the initial phrase. Implicit abbreviations frequently cause comprehension difficulties for unprepared readers. In this paper, we propose an efficient ML-based algorithm that identifies abbreviations in Russian texts. The method achieves a ROC AUC score of 0.926 and an F1 score of 0.706, which are competitive with the baselines. Along with the pipeline, we also establish what is, to our knowledge, the first Russian dataset relevant to this task.
Submission history
From: Anna Berdichevskaia
[v1] Sat, 4 Jun 2022 13:16:08 GMT (1074kb,D)