Title: Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks

Abstract: The attention mechanism is a key component of the neural revolution in Natural Language Processing (NLP). As the size of attention-based models has scaled with the available computational resources, a number of pruning techniques have been developed to detect and exploit sparseness in such models in order to make them more efficient. The majority of such efforts have focused either on looking for attention patterns and then hard-coding them to achieve sparseness, or on pruning the weights of the attention mechanisms based on statistical information from the training data. Here, we marry these two lines of research by proposing Attention Pruning (AP): a novel pruning framework that collects observations about the attention patterns in a fixed dataset and then induces a global sparseness mask for the model. This can save 90% of the attention computation for language modelling and about 50% for machine translation and for solving GLUE tasks, while maintaining the quality of the results. Moreover, using our method, we discovered important distinctions between self- and cross-attention patterns, which could guide future NLP research in attention-based modelling. Our framework can in principle speed up any model that uses an attention mechanism, thus helping develop better models for existing or new NLP applications. Our implementation is available at this https URL
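
The idea summarized in the abstract (collect attention patterns over a fixed dataset, then induce one global sparseness mask) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `average_attention` and `global_mask`, the element-wise averaging, and the choice to prune a fixed fraction of the lowest-average positions are all assumptions made for illustration only.

```python
from typing import List

Matrix = List[List[float]]


def average_attention(attention_maps: List[Matrix]) -> Matrix:
    """Average equally sized attention matrices element-wise.

    Assumption: each matrix is the attention observed on one dataset example.
    """
    n = len(attention_maps)
    rows, cols = len(attention_maps[0]), len(attention_maps[0][0])
    return [
        [sum(m[i][j] for m in attention_maps) / n for j in range(cols)]
        for i in range(rows)
    ]


def global_mask(avg: Matrix, prune_fraction: float) -> List[List[int]]:
    """Return a 0/1 mask that drops the lowest-average attention positions.

    Hypothetical rule: prune about `prune_fraction` of all positions.
    Ties at the threshold are pruned too, so slightly more may be dropped.
    """
    flat = sorted(v for row in avg for v in row)
    k = int(len(flat) * prune_fraction)  # number of positions to prune
    if k == 0:
        return [[1 for _ in row] for row in avg]
    threshold = flat[k - 1]  # largest average value that is still pruned
    return [[1 if v > threshold else 0 for v in row] for row in avg]


# Toy data: two 2x2 attention maps (values chosen to be exact in binary).
map_a = [[0.75, 0.25], [0.25, 0.75]]
map_b = [[0.5, 0.5], [0.125, 0.875]]

avg = average_attention([map_a, map_b])
mask = global_mask(avg, prune_fraction=0.5)
print(mask)  # [[1, 0], [0, 1]]
```

In this toy case the mask keeps only the diagonal positions. Because the mask is fixed once for the whole dataset rather than computed per input, masked positions never need to be evaluated at inference time, which is the kind of saving the abstract refers to.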
Comments: 13 pages, 6 figures, 10 tables
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2012.02030 [cs.CL]
  (or arXiv:2012.02030v2 [cs.CL] for this version)

Submission history

From: Rumen Dangovski
[v1] Fri, 20 Nov 2020 13:58:21 GMT (7993kb,D)
[v2] Sat, 8 May 2021 23:24:17 GMT (993kb,D)
