Title: Inserting Information Bottlenecks for Attribution in Transformers

Abstract: Pretrained transformers achieve the state of the art across tasks in natural language processing, motivating researchers to investigate their inner mechanisms. One common direction is to understand what features are important for prediction. In this paper, we apply information bottlenecks to analyze the attribution of each feature for prediction on a black-box model. We use BERT as the example and evaluate our approach both quantitatively and qualitatively. We show the effectiveness of our method in terms of attribution and the ability to provide insight into how information flows through layers. We demonstrate that our technique outperforms two competitive methods in degradation tests on four datasets. Code is available at this https URL
Comments: refine formula
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Journal reference: In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings (pp. 3850-3857)
Cite as: arXiv:2012.13838 [cs.CL]
  (or arXiv:2012.13838v2 [cs.CL] for this version)

Submission history

From: Zhiying Jiang
[v1] Sun, 27 Dec 2020 00:35:43 GMT (117kb,D)
[v2] Thu, 5 Aug 2021 02:19:44 GMT (252kb,D)