We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.LG

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Machine Learning

Title: Programming Puzzles

Abstract: We introduce a new type of programming challenge called programming puzzles, as an objective and comprehensive evaluation of program synthesis, and release an open-source dataset of Python Programming Puzzles (P3). Each puzzle is defined by a short Python program $f$, and the goal is to find an input which makes $f$ return True. The puzzles are objective in that each one is specified entirely by the source code of its verifier $f$, so evaluating $f$ is all that is needed to test a candidate solution. They do not require an answer key or input/output examples, nor do they depend on natural language understanding. The dataset is comprehensive in that it spans problems of a range of difficulties and domains, ranging from trivial string manipulation problems, to classic programming puzzles (e.g., Tower of Hanoi), to interview/competitive-programming problems (e.g., dynamic programming), to longstanding open problems in algorithms and mathematics (e.g., factoring). We develop baseline enumerative program synthesis, GPT-3 and Codex solvers that are capable of solving puzzles -- even without access to any reference solutions -- by learning from their own past solutions. Codex performs best, solving up to 18% of 397 test problems with a single try and 80% of the problems with 1,000 tries per problem. In a small user study, we find a positive correlation between puzzle-solving performance and coding experience, and between the puzzle difficulty for humans and AI solvers. Therefore, further improvements on P3 could have a significant impact on many program synthesis areas.
Comments: NeurIPS 2021 (Datasets and Benchmarks Track). Puzzles repository: this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Programming Languages (cs.PL); Software Engineering (cs.SE)
Cite as: arXiv:2106.05784 [cs.LG]
  (or arXiv:2106.05784v3 [cs.LG] for this version)

Submission history

From: Tal Schuster [view email]
[v1] Thu, 10 Jun 2021 14:37:28 GMT (2751kb,D)
[v2] Mon, 20 Sep 2021 03:41:06 GMT (2952kb,D)
[v3] Sat, 6 Nov 2021 20:53:50 GMT (3178kb,D)

Link back to: arXiv, form interface, contact.