Automatic Generation of Programming Exercises and Code Explanations with Large Language Models

Sarsa, Sami; Denny, Paul; Hellas, Arto; Leinonen, Juho

(Submitted on 3 Jun 2022 (this version), latest version 26 Jun 2022 (v2))

Abstract: OpenAI Codex is a recent large language model from the GPT-3 family for translating code into natural language and vice versa. Recent explorations of Codex have highlighted that, given typical introductory programming exercise problem statements as input, the model can generate code solutions well above the level of an average student. In this article, we explore the natural language generation capabilities of Codex in two different phases of the life of a programming exercise: automatically creating programming exercises (including sample solutions and test cases) and generating explanations of written code. We assess both qualitatively and quantitatively. We find the majority of this automatically generated content to be both novel and sensible, and in many cases ready to use as is. We further find that influencing the content of the created programming exercises is remarkably easy with minor modifications to the input. Our analysis suggests that there is significant value in massive generative machine learning models as a tool for instructors, although some oversight might be needed to ensure the quality of the generated content before it is delivered to students. We further discuss the implications of OpenAI Codex and similar tools for introductory programming education and highlight future research streams that have the potential to improve the quality of the educational experience for teachers and students alike.

Comments:	19 pages, 1 figure, accepted in ICER
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2206.11861 [cs.SE]
	(or arXiv:2206.11861v1 [cs.SE] for this version)

Submission history

From: Sami Sarsa
[v1] Fri, 3 Jun 2022 11:00:43 GMT (107kb,D)
[v2] Sun, 26 Jun 2022 12:19:46 GMT (138kb,D)
