Title: CC-Riddle: A Question Answering Dataset of Chinese Character Riddles

Abstract: The Chinese character riddle is a challenging riddle game whose solution is a single character. The riddle describes the pronunciation, shape, and meaning of the solution character using rhetorical techniques. In this paper, we propose a Chinese character riddle dataset that covers the majority of common simplified Chinese characters, built by crawling riddles from the Web and generating brand-new ones. In the generation stage, we provide the generation model with the Chinese phonetic alphabet, decomposition, and explanation of the solution character, and obtain multiple riddle descriptions for each tested character. The generated riddles are then manually filtered, and the final dataset, CC-Riddle, is composed of both human-written riddles and filtered generated riddles. Furthermore, we build a character riddle QA system based on our dataset and find that existing models struggle to solve such tricky questions. CC-Riddle is now publicly available.
Comments: 10 pages, 8 figures, 7 tables
Subjects: Computation and Language (cs.CL)
ACM classes: I.2.7
Cite as: arXiv:2206.13778 [cs.CL]
  (or arXiv:2206.13778v1 [cs.CL] for this version)

Submission history

From: Fan Xu
[v1] Tue, 28 Jun 2022 06:23:13 GMT (472kb,D)
