Title: Log-Precision Transformers are Constant-Depth Uniform Threshold Circuits
(Submitted on 2 Jul 2022 (this version), latest version 26 Apr 2023 (v4))
Abstract: We prove that transformer neural networks with logarithmic precision in the input length (and where the feedforward subnetworks are computable using linear space in their input length) can be simulated by constant-depth uniform threshold circuits. Thus, such transformers only recognize formal languages in $\mathsf{TC}^0$, the class of languages defined by constant-depth, poly-size threshold circuits. This demonstrates a connection between a practical claim in NLP and a theoretical conjecture in computational complexity theory: "attention is all you need" (Vaswani et al., 2017), i.e., transformers are capable of all efficient computation, only if all efficiently computable problems can be solved with log space, i.e., $\mathsf L = \mathsf P$. We also construct a transformer that can evaluate any constant-depth threshold circuit on any input, proving that transformers can follow instructions that are representable in $\mathsf{TC}^0$.
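To make the circuit model named in the abstract concrete, the following is a minimal sketch of evaluating a layered threshold circuit. It is not taken from the paper: the function name `eval_threshold_circuit`, the `Gate` encoding, and the example circuit are all assumptions chosen for illustration. A threshold gate outputs 1 when at least k of its input wires are 1, and a constant-depth circuit is modeled here as a fixed number of layers of such gates.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: a gate is (input_indices, threshold). It outputs 1
# when at least `threshold` of the referenced wires carry the value 1.
Gate = Tuple[Sequence[int], int]


def eval_threshold_circuit(layers: List[List[Gate]], inputs: Sequence[int]) -> List[int]:
    """Evaluate a layered threshold circuit on a 0/1 input vector.

    Each layer reads only the outputs of the previous layer (the first
    layer reads the circuit inputs), so the depth equals len(layers).
    """
    wires = list(inputs)
    for layer in layers:
        wires = [int(sum(wires[i] for i in idxs) >= k) for idxs, k in layer]
    return wires


# Illustrative depth-2 circuit over four input bits x0..x3:
#   layer 1: AND(x0, x1) as threshold 2 of 2, OR(x2, x3) as threshold 1 of 2
#   layer 2: AND of the two layer-1 outputs
example_circuit: List[List[Gate]] = [
    [([0, 1], 2), ([2, 3], 1)],
    [([0, 1], 2)],
]

result = eval_threshold_circuit(example_circuit, [1, 1, 0, 1])
```

AND and OR are special cases of a threshold gate (k equal to the fan-in, and k equal to 1, respectively), which is why a single gate type suffices in this sketch. Negation and the uniformity condition discussed in the abstract are not modeled here.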
Submission history
From: William Merrill
[v1] Sat, 2 Jul 2022 03:49:34 GMT (51kb)
[v2] Tue, 21 Feb 2023 05:45:52 GMT (369kb,D)
[v3] Tue, 7 Mar 2023 23:07:38 GMT (369kb,D)
[v4] Wed, 26 Apr 2023 22:34:12 GMT (369kb,D)