Quantifying Context Mixing in Transformers

Mohebbi, Hosein; Zuidema, Willem; Chrupała, Grzegorz; Alishahi, Afra

(Submitted on 30 Jan 2023 (v1), last revised 8 Feb 2023 (this version, v2))

Abstract: Self-attention weights and their transformed variants have been the main source of information for analyzing token-to-token interactions in Transformer-based models. But despite their ease of interpretation, these weights are not faithful to the models' decisions as they are only one part of an encoder, and other components in the encoder layer can have considerable impact on information mixing in the output representations. In this work, by expanding the scope of analysis to the whole encoder block, we propose Value Zeroing, a novel context mixing score customized for Transformers that provides us with a deeper understanding of how information is mixed at each encoder layer. We demonstrate the superiority of our context mixing score over other analysis methods through a series of complementary evaluations with different viewpoints based on linguistically informed rationales, probing, and faithfulness analysis.

Comments: Accepted to EACL 2023 (main)
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2301.12971 [cs.CL] (or arXiv:2301.12971v2 [cs.CL] for this version)

Submission history

From: Hosein Mohebbi
[v1] Mon, 30 Jan 2023 15:19:02 GMT (765kb,D)
[v2] Wed, 8 Feb 2023 10:46:38 GMT (878kb,D)
