Evaluating Explanations: How much do explanations from the teacher aid students?

Pruthi, Danish; Dhingra, Bhuwan; Soares, Livio Baldini; Collins, Michael; Lipton, Zachary C.; Neubig, Graham; Cohen, William W.

(Submitted on 1 Dec 2020 (this version), latest version 17 Dec 2021 (v2))

Abstract: While many methods purport to explain predictions by highlighting salient features, what precise aims these explanations serve and how to evaluate their utility are often unstated. In this work, we formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning to simulate the teacher model on unseen examples for which explanations are unavailable. Student models incorporate explanations in training (but not prediction) procedures. Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions. Using our framework, we compare multiple attribution methods and observe consistent and quantitative differences amongst them across multiple learning strategies.
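The measurement described in the abstract can be illustrated with a minimal sketch. It assumes that the student's ability to simulate the teacher is scored as simple agreement on held-out examples, and that the value of an explanation method is read off as the difference between a student trained with explanations and one trained without. The function names and the toy labels are hypothetical and are not taken from the paper, which may define the measure differently.

```python
from typing import Sequence


def simulation_accuracy(student_preds: Sequence[int],
                        teacher_preds: Sequence[int]) -> float:
    """Fraction of held-out examples on which the student matches the teacher."""
    if len(student_preds) != len(teacher_preds):
        raise ValueError("prediction lists must have the same length")
    if len(teacher_preds) == 0:
        raise ValueError("need at least one example")
    matches = sum(s == t for s, t in zip(student_preds, teacher_preds))
    return matches / len(teacher_preds)


def explanation_gain(with_explanations: Sequence[int],
                     without_explanations: Sequence[int],
                     teacher_preds: Sequence[int]) -> float:
    """Assumed summary score: how much explanations raise simulation accuracy."""
    return (simulation_accuracy(with_explanations, teacher_preds)
            - simulation_accuracy(without_explanations, teacher_preds))


# Toy predictions, invented purely for illustration (not results from the paper).
teacher = [1, 0, 1, 1, 0, 1, 0, 0]            # teacher outputs on unseen inputs
student_plain = [1, 0, 0, 1, 1, 1, 0, 1]      # student trained without explanations
student_explained = [1, 0, 1, 1, 1, 1, 0, 0]  # student trained with explanations

gain = explanation_gain(student_explained, student_plain, teacher)
print(f"gain in simulation accuracy: {gain:.3f}")
```

Note that both students predict without access to explanations at test time, matching the setup in which explanations are used in training but not in prediction.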

Comments:	Preprint
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2012.00893 [cs.CL]
	(or arXiv:2012.00893v1 [cs.CL] for this version)

Submission history

From: Danish Pruthi [view email]
[v1] Tue, 1 Dec 2020 23:40:21 GMT (633kb,D)
[v2] Fri, 17 Dec 2021 04:50:55 GMT (494kb,D)
