

Title: Log-Precision Transformers are Constant-Depth Uniform Threshold Circuits

Abstract: We prove that transformer neural networks with logarithmic precision in the input length (and where the feedforward subnetworks are computable using linear space in their input length) can be simulated by constant-depth uniform threshold circuits. Thus, such transformers only recognize formal languages in $\mathsf{TC}^0$, the class of languages defined by constant-depth, poly-size threshold circuits. This demonstrates a connection between a practical claim in NLP and a theoretical conjecture in computational complexity theory: "attention is all you need" (Vaswani et al., 2017), i.e., transformers are capable of all efficient computation, only if all efficiently computable problems can be solved with log space, i.e., $\mathsf L = \mathsf P$. We also construct a transformer that can evaluate any constant-depth threshold circuit on any input, proving that transformers can follow instructions that are representable in $\mathsf{TC}^0$.
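
To make the notion of a constant-depth threshold circuit concrete, here is a minimal sketch in Python. It is an illustration only, not a construction from the paper: the function names `threshold` and `exactly_k` are hypothetical, and the example assumes the common convention that input bits may be negated for free. A threshold gate outputs 1 when at least k of its inputs are 1; the depth-2 circuit below decides whether a bit string contains exactly k ones.

```python
from typing import Sequence


def threshold(bits: Sequence[int], k: int) -> int:
    """Threshold gate: output 1 iff at least k of the input bits are 1."""
    return int(sum(bits) >= k)


def exactly_k(bits: Sequence[int], k: int) -> int:
    """Depth-2 threshold circuit deciding whether exactly k bits are 1.

    Layer 1: two threshold gates, one over the inputs and one over the
             negated inputs (assumption: input negation is free).
    Layer 2: an AND gate, itself a threshold gate whose threshold equals
             its fan-in.
    """
    n = len(bits)
    at_least_k = threshold(bits, k)
    # "at most k ones" is the same as "at least n - k zeros"
    at_most_k = threshold([1 - b for b in bits], n - k)
    return threshold([at_least_k, at_most_k], 2)


if __name__ == "__main__":
    print(exactly_k([1, 0, 1, 0], 2))
    print(exactly_k([1, 1, 1, 0], 2))
```

The depth of this circuit stays at 2 however long the input is, while only the fan-in of the gates grows with the input length; that is the sense in which circuits for languages in $\mathsf{TC}^0$ have constant depth.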
Comments: Preprint
Subjects: Computational Complexity (cs.CC); Computation and Language (cs.CL)
Cite as: arXiv:2207.00729 [cs.CC]
  (or arXiv:2207.00729v1 [cs.CC] for this version)

Submission history

From: William Merrill
[v1] Sat, 2 Jul 2022 03:49:34 GMT (51kb)
[v2] Tue, 21 Feb 2023 05:45:52 GMT (369kb,D)
[v3] Tue, 7 Mar 2023 23:07:38 GMT (369kb,D)
[v4] Wed, 26 Apr 2023 22:34:12 GMT (369kb,D)
