TALM: Tool Augmented Language Models

Parisi, Aaron; Zhao, Yao; Fiedel, Noah

(Submitted on 24 May 2022)

Abstract: Transformer-based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks. Scale alone, however, cannot enable models to solve tasks that require access to ephemeral, changing, or private data that was unavailable at training time. Many useful tasks may also benefit from LMs being able to access APIs that read or modify state. In this work, we present Tool Augmented Language Models (TALM), which combine a text-only approach to augmenting language models with non-differentiable tools and an iterative "self-play" technique that bootstraps performance starting from a few tool demonstrations. TALM exhibits strong performance on both a knowledge-heavy QA task and a reasoning-oriented math task with simple tools. At a given model scale, TALM significantly outperforms non-augmented LMs. We further demonstrate that TALM successfully performs out-of-distribution inferences on both QA and math tasks, where non-augmented LMs fail. Our results suggest that Tool Augmented Language Models are a promising direction to enrich LMs' capabilities, with less dependence on scale.
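
The two ideas named in the abstract, a text-only tool interface and iterative self-play from a few demonstrations, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `tool query:` / `tool result:` / `answer:` markers, the exact-match filter, and the toy model and calculator are all hypothetical. The fine-tuning step that would follow each round is omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a "model" maps prompt text to continuation text,
# and a "tool" maps query text to result text. The tool is an ordinary
# function call, so no gradients flow through it.
Model = Callable[[str], str]
Tool = Callable[[str], str]


def run_with_tool(model: Model, tool: Tool, task: str) -> Tuple[str, str]:
    """One text-only episode: the model writes a tool query, the tool result
    is appended as plain text, then the model writes the final answer."""
    query = model(f"{task}\ntool query:")
    result = tool(query)
    transcript = f"{task}\ntool query: {query}\ntool result: {result}\nanswer:"
    answer = model(transcript)
    return f"{transcript} {answer}", answer


def self_play_round(
    model: Model, tool: Tool, tasks: List[Tuple[str, str]], demos: List[str]
) -> List[str]:
    """Return the demonstration set grown by every episode whose answer
    matches the known target (assumed filter: exact string match)."""
    grown = list(demos)
    for task, target in tasks:
        transcript, answer = run_with_tool(model, tool, task)
        if answer.strip() == target.strip():
            grown.append(transcript)
    # A real system would now fine-tune the model on `grown` and repeat.
    return grown


def toy_model(prompt: str) -> str:
    """Toy model: forwards the task as the tool query, then copies the
    tool result into the answer."""
    lines = prompt.split("\n")
    if lines[-1] == "tool query:":
        return lines[0]
    for line in lines:
        if line.startswith("tool result: "):
            return line[len("tool result: "):]
    return ""


def toy_calculator(query: str) -> str:
    """Toy tool: supports only 'a + b' and 'a * b' on integers."""
    a, op, b = query.split()
    if op == "+":
        return str(int(a) + int(b))
    if op == "*":
        return str(int(a) * int(b))
    return "unsupported"


if __name__ == "__main__":
    seed_demos = ["1 + 1\ntool query: 1 + 1\ntool result: 2\nanswer: 2"]
    tasks = [("2 + 3", "5"), ("4 * 6", "24"), ("7 - 1", "6")]
    for demo in self_play_round(toy_model, toy_calculator, tasks, seed_demos):
        print(demo, end="\n\n")
```

In this toy run the subtraction task is filtered out because the calculator cannot handle it, so only episodes with a verified answer join the demonstration set. That filtering is the bootstrapping idea in miniature.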

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.12255 [cs.CL]
	(or arXiv:2205.12255v1 [cs.CL] for this version)

Submission history

From: Yao Zhao
[v1] Tue, 24 May 2022 17:58:13 GMT (763kb,D)
