On the Ability and Limitations of Transformers to Recognize Formal Languages

Bhattamishra, Satwik; Ahuja, Kabir; Goyal, Navin

(Submitted on 23 Sep 2020 (v1), last revised 8 Oct 2020 (this version, v2))

Abstract: Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past works suggest that LSTMs generalize very well on regular languages and have close connections with counter languages. In this work, we systematically study the ability of Transformers to model such languages, as well as the role of their individual components in doing so. We first provide a construction of Transformers for a subclass of counter languages, including well-studied languages such as n-ary Boolean Expressions, Dyck-1, and its generalizations. In experiments, we find that Transformers do well on this subclass, and their learned mechanism strongly correlates with our construction. Perhaps surprisingly, in contrast to LSTMs, Transformers do well only on a subset of regular languages, with performance degrading as we make the languages more complex according to a well-known measure of complexity. Our analysis also provides insights into the role of the self-attention mechanism in modeling certain behaviors and the influence of positional encoding schemes on the learning and generalization abilities of the model.
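
For readers unfamiliar with counter languages, the following minimal Python sketch shows why Dyck-1 (well-balanced strings over a single bracket pair) belongs to that class: one integer counter is enough to recognize it. This is an illustration only, not the Transformer construction from the paper; the function name is_dyck1 and the choice of "(" and ")" as the bracket symbols are assumptions made for the example.

```python
def is_dyck1(s: str) -> bool:
    """Recognize Dyck-1 over the (assumed) alphabet {'(', ')'} with one counter."""
    depth = 0  # number of currently unmatched opening brackets
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing bracket appeared with nothing left to match.
                return False
        else:
            raise ValueError(f"symbol outside the alphabet: {ch!r}")
    # Accept only if every opening bracket has been closed.
    return depth == 0


if __name__ == "__main__":
    for example in ["", "()", "(()())", "(()", ")("]:
        print(repr(example), is_dyck1(example))
```

The counter never needs to hold more than the current nesting depth, which is the property that distinguishes this kind of language from regular languages, where a fixed, finite amount of state suffices.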

Comments:	EMNLP 2020
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2009.11264 [cs.CL]
	(or arXiv:2009.11264v2 [cs.CL] for this version)

Submission history

From: Satwik Bhattamishra
[v1] Wed, 23 Sep 2020 17:21:33 GMT (479kb,D)
[v2] Thu, 8 Oct 2020 12:55:37 GMT (183kb,D)
