Inverse Classification for Comparison-based Interpretability in Machine Learning

Laugel, Thibault; Lesot, Marie-Jeanne; Marsala, Christophe; Renard, Xavier; Detyniecki, Marcin

(Submitted on 22 Dec 2017)

Abstract: In the context of post-hoc interpretability, this paper addresses the task of explaining the prediction of a classifier when no information is available about either the classifier itself or the processed data (neither the training nor the test data). It proposes an instance-based approach based on determining the minimal changes needed to alter a prediction: given a data point whose classification must be explained, the method identifies a close neighbour that is classified differently, where the definition of closeness integrates a sparsity constraint. This principle is implemented using observation generation in the Growing Spheres algorithm. Experimental results on two datasets illustrate the relevance of the proposed approach, which can be used to gain knowledge about the classifier.
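
The principle described in the abstract (generate observations around a data point until one is classified differently, then favour sparse changes) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how observations are generated or how sparsity is enforced, so the layer-by-layer sampling, the coordinate-reset step, and every name here (`sample_in_layer`, `find_enemy`, `sparsify`, `toy_classifier`, and all parameters) are assumptions made for the example. The classifier is treated as a black box that is only queried for predictions.

```python
import math
import random


def sample_in_layer(center, r_lo, r_hi, rng):
    """Draw one point whose Euclidean distance to `center` lies in [r_lo, r_hi]."""
    d = len(center)
    # Random direction: normalise a vector of independent Gaussians.
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction)) or 1.0
    # Radius drawn so that points are uniform in volume within the layer.
    u = rng.random()
    radius = (r_lo ** d + u * (r_hi ** d - r_lo ** d)) ** (1.0 / d)
    return [c + radius * v / norm for c, v in zip(center, direction)]


def find_enemy(predict, x, step=0.1, n_samples=200, max_layers=1000, seed=0):
    """Search layers of growing radius around x for a differently classified point.

    Assumed procedure: only calls to `predict` are used, no training data.
    Returns None if nothing is found within `max_layers` layers.
    """
    rng = random.Random(seed)
    label = predict(x)
    r_lo = 0.0
    for _ in range(max_layers):
        r_hi = r_lo + step
        candidates = [sample_in_layer(x, r_lo, r_hi, rng) for _ in range(n_samples)]
        enemies = [c for c in candidates if predict(c) != label]
        if enemies:
            # Keep the generated point closest to x among those found.
            return min(enemies, key=lambda c: math.dist(c, x))
        r_lo = r_hi
    return None


def sparsify(predict, x, enemy):
    """Reset coordinates of `enemy` to those of x while the prediction stays different."""
    label = predict(x)
    result = list(enemy)
    # Try the coordinates that moved least first (a heuristic chosen for this sketch).
    order = sorted(range(len(x)), key=lambda i: abs(result[i] - x[i]))
    for i in order:
        trial = list(result)
        trial[i] = x[i]
        if predict(trial) != label:
            result = trial
    return result


def toy_classifier(p):
    """Hypothetical black-box classifier: class 1 when the first feature exceeds 1."""
    return int(p[0] > 1.0)


x = [0.0, 0.0]
enemy = find_enemy(toy_classifier, x)
sparse = sparsify(toy_classifier, x, enemy)
```

In this toy setting only the first feature influences the prediction, so the sparsification step resets the second coordinate to its original value, and the difference between `x` and `sparse` points at the single feature that would need to change to alter the prediction.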

Comments:	preprint
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1712.08443 [stat.ML]
	(or arXiv:1712.08443v1 [stat.ML] for this version)

Submission history

From: Thibault Laugel
[v1] Fri, 22 Dec 2017 13:51:21 GMT (110kb,D)
