Computer Science > Machine Learning
Title: Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
(Submitted on 27 Dec 2019 (this version), latest version 30 Dec 2019 (v2))
Abstract: The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by RNNs could be replaced by active-memory mechanisms. In this work, we evaluate whether various active-memory mechanisms could replace self-attention in a Transformer. Our experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling, but optimal results are mostly achieved by using both active-memory and self-attention mechanisms together. We also note that, for some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two.
Submission history
From: Thomas Dowdell BCom(Hons)
[v1] Fri, 27 Dec 2019 02:01:13 GMT (339kb,D)
[v2] Mon, 30 Dec 2019 09:01:18 GMT (339kb,D)