References & Citations
Computer Science > Computation and Language
Title: Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian
(Submitted on 3 Jun 2022)
Abstract: We present the shared task on artificial text detection in Russian, which is organized as a part of the Dialogue Evaluation initiative, held in 2022. The shared task dataset includes texts from 14 text generators, i.e., one human writer and 13 text generative models fine-tuned for one or more of the following generation tasks: machine translation, paraphrase generation, text summarization, text simplification. We also consider back-translation and zero-shot generation approaches. The human-written texts are collected from publicly available resources across multiple domains. The shared task consists of two sub-tasks: (i) to determine if a given text is automatically generated or written by a human; (ii) to identify the author of a given text. The first task is framed as a binary classification problem. The second task is a multi-class classification problem. We provide count-based and BERT-based baselines, along with the human evaluation on the first sub-task. A total of 30 and 8 systems have been submitted to the binary and multi-class sub-tasks, correspondingly. Most teams outperform the baselines by a wide margin. We publicly release our codebase, human evaluation results, and other materials in our GitHub repository (this https URL).
Submission history
From: Ekaterina Artemova [view email][v1] Fri, 3 Jun 2022 14:12:33 GMT (7161kb,D)
Link back to: arXiv, form interface, contact.