Computer Science > Computation and Language
Title: Reservoir Transformers
(Submitted on 30 Dec 2020 (v1), last revised 1 Jun 2021 (this version, v2))
Abstract: We demonstrate that transformers obtain impressive performance even when some of the layers are randomly initialized and never updated. Inspired by old and well-established ideas in machine learning, we explore a variety of non-linear "reservoir" layers interspersed with regular transformer layers, and show improvements in wall-clock compute time until convergence, as well as overall performance, on various machine translation and (masked) language modelling tasks.
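The core idea in the abstract, layers that are randomly initialized and never updated sitting between regular trainable layers, can be illustrated with a toy sketch. This is not the authors' implementation: the `Layer` class, `build_stack`, `sgd_step`, the tanh layer form, and the choice of which indices are reservoirs are all hypothetical, chosen only to show that a frozen layer still takes part in the forward pass while being skipped by the update step.

```python
import math
import random


class Layer:
    """Toy layer y = tanh(W x). trainable=False marks a reservoir layer (assumption)."""

    def __init__(self, dim, trainable, rng):
        scale = dim ** -0.5  # simple random initialization, illustrative only
        self.weights = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]
        self.trainable = trainable

    def forward(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.weights]


def build_stack(num_layers, dim, reservoir_indices, seed=0):
    """Build a stack in which the layers at reservoir_indices are frozen."""
    rng = random.Random(seed)
    return [
        Layer(dim, trainable=(i not in reservoir_indices), rng=rng)
        for i in range(num_layers)
    ]


def forward(stack, x):
    """Every layer, frozen or not, contributes to the forward pass."""
    for layer in stack:
        x = layer.forward(x)
    return x


def sgd_step(stack, grads, lr=0.1):
    """Apply gradients to trainable layers only; reservoir layers keep their initial weights."""
    for layer, grad in zip(stack, grads):
        if not layer.trainable:
            continue  # reservoir layer: never updated
        for row, grad_row in zip(layer.weights, grad):
            for j, g in enumerate(grad_row):
                row[j] -= lr * g


# Usage: four layers, with layers 1 and 3 acting as reservoirs (hypothetical placement).
stack = build_stack(num_layers=4, dim=3, reservoir_indices={1, 3})
output = forward(stack, [0.1, -0.2, 0.3])
```

Because updates are skipped for the frozen layers, no gradient with respect to their weights is needed, which is one plausible source of the compute savings the abstract mentions.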
Submission history
From: Sheng Shen
[v1] Wed, 30 Dec 2020 05:20:16 GMT (388kb,D)
[v2] Tue, 1 Jun 2021 19:32:18 GMT (914kb,D)