Title: Lyra: A Benchmark for Turducken-Style Code Generation

Abstract: Code generation is crucial to reduce manual software development efforts. Recently, neural techniques have been used to generate source code automatically. While promising, these approaches are evaluated on tasks for generating code in a single programming language. However, in actual development, one programming language is often embedded in another. For example, SQL statements are often embedded as strings in base programming languages such as Python and Java, and JavaScript programs are often embedded in server-side programming languages, such as PHP, Java, and Python. We call this turducken-style programming. In this paper, we define a new code generation task: given a natural language comment, this task aims to generate a program in a base language with an embedded language. To our knowledge, this is the first turducken-style code generation task. For this task, we present Lyra: a dataset in Python with embedded SQL. This dataset contains 2,000 carefully annotated database manipulation programs from real-world projects. Each program is paired with both a Chinese comment and an English comment. In our experiment, we adopted Transformer, a state-of-the-art technique, as the baseline. In the best setting, Transformer achieves 0.5% and 1.5% AST exact matching accuracy using Chinese and English comments, respectively. Therefore, we believe that Lyra provides a new challenge for code generation.
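As an illustration of what the abstract calls turducken-style code, the following is a minimal sketch of a Python function with an embedded SQL string, together with one plausible way to check AST-level exact match between a generated program and a reference. The table `users`, the function names `get_user_names` and `ast_exact_match`, and the comparison via `ast.dump` are assumptions made for this example; they are not taken from the Lyra dataset or from the paper's evaluation code, whose exact metric definition may differ.

```python
import ast
import sqlite3


def get_user_names(conn, min_age):
    """Return the names of users at least `min_age` years old, sorted by name."""
    # The SQL statement is embedded as a string inside the Python base language.
    sql = "SELECT name FROM users WHERE age >= ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (min_age,))]


def ast_exact_match(predicted: str, reference: str) -> bool:
    """Hypothetical metric: True if both snippets parse to the same Python AST.

    Formatting and comments are ignored; the embedded SQL is compared
    only as a string literal, not parsed as SQL.
    """
    try:
        return ast.dump(ast.parse(predicted)) == ast.dump(ast.parse(reference))
    except SyntaxError:
        # Unparseable output can never match the reference.
        return False


# Small in-memory database (hypothetical schema) to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ann", 31), ("bob", 17), ("cy", 45)],
)
names = get_user_names(conn, 18)
```

Under this sketch, two programs that differ only in whitespace or comments count as a match, while any change to the embedded SQL string counts as a mismatch, which hints at why exact-match scores on such a task can be very low.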
Comments: 9 pages, 4 figures
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as: arXiv:2108.12144 [cs.SE]
  (or arXiv:2108.12144v1 [cs.SE] for this version)

Submission history

From: Qingyuan Liang
[v1] Fri, 27 Aug 2021 07:22:55 GMT (1288kb,D)
[v2] Wed, 4 May 2022 15:59:44 GMT (2539kb,D)
[v3] Sun, 24 Jul 2022 04:54:17 GMT (3226kb,D)
