LLMs and the Abstraction and Reasoning Corpus: Successes, Failures, and the Importance of Object-based Representations

Xu, Yudong; Li, Wenhao; Vaezipoor, Pashootan; Sanner, Scott; Khalil, Elias B.

(Submitted on 26 May 2023 (v1), last revised 14 Feb 2024 (this version, v2))

Abstract: Can a Large Language Model (LLM) solve simple abstract reasoning problems? We explore this broad question through a systematic analysis of GPT on the Abstraction and Reasoning Corpus (ARC), a representative benchmark of abstract reasoning ability from limited examples in which solutions require some "core knowledge" of concepts such as objects, goal states, counting, and basic geometry. GPT-4 solves only 13/50 of the most straightforward ARC tasks when using textual encodings for their two-dimensional input-output grids. Our failure analysis reveals that GPT-4's capacity to identify objects and reason about them is significantly influenced by the sequential nature of the text that represents an object within a text encoding of a task. To test this hypothesis, we design a new benchmark, the 1D-ARC, which consists of one-dimensional (array-like) tasks that are more conducive to GPT-based reasoning, and where it indeed performs better than on the (2D) ARC. To alleviate this issue, we propose an object-based representation that is obtained through an external tool, resulting in nearly doubling the performance on solved ARC tasks and near-perfect scores on the easier 1D-ARC. Although the state-of-the-art GPT-4 is unable to "reason" perfectly within non-language domains such as the 1D-ARC or a simple ARC subset, our study reveals that the use of object-based representations can significantly improve its reasoning ability. Visualizations, GPT logs, and data are available at this https URL
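The contrast the abstract draws between a textual encoding of a 2D grid and an object-based representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the space-separated row-by-row encoding, the definition of an object as a 4-connected group of same-colored non-background cells, and the example grid. The abstract does not specify the paper's actual encodings or the external tool it uses, so this should not be read as the authors' method.

```python
from collections import deque


def grid_to_text(grid):
    """Encode a 2D grid as text: one row per line, cells separated by spaces.

    Hypothetical encoding; the paper's exact textual format is not given here.
    """
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)


def extract_objects(grid, background=0):
    """Group same-colored, 4-connected, non-background cells into objects.

    Assumed definition of "object" for illustration only.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    objects = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == background or (r, c) in seen:
                continue
            color = grid[r][c]
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            # Breadth-first search over neighbors that share the same color.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and (ny, nx) not in seen
                        and grid[ny][nx] == color
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append({"color": color, "cells": sorted(cells)})
    return objects


def objects_to_text(objects):
    """Describe each object on one line so its cells stay contiguous in the text."""
    return "\n".join(
        f"object {i}: color={obj['color']}, size={len(obj['cells'])}, cells={obj['cells']}"
        for i, obj in enumerate(objects, start=1)
    )


# Hypothetical 3x4 grid: a vertical bar of color 2 and a shorter bar of color 3.
grid = [
    [0, 2, 0, 0],
    [0, 2, 0, 3],
    [0, 2, 0, 3],
]

print(grid_to_text(grid))
print(objects_to_text(extract_objects(grid)))
```

In the row-by-row text, the three cells of the vertical bar are separated by other tokens, whereas the object listing keeps each object's cells together on a single line. That is one way to picture why the sequential layout of an object in the text might matter.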

Comments:	26 pages, 15 figures, published in Transactions on Machine Learning Research (TMLR)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2305.18354 [cs.CL]
	(or arXiv:2305.18354v2 [cs.CL] for this version)

Submission history

From: Yudong Xu
[v1] Fri, 26 May 2023 16:32:17 GMT (15541kb,D)
[v2] Wed, 14 Feb 2024 21:15:31 GMT (9564kb,D)
