Explainability Via Causal Self-Talk

Roy, Nicholas A.; Kim, Junkyung; Rabinowitz, Neil

Computer Science > Artificial Intelligence

(Submitted on 17 Nov 2022)

Abstract: Explaining the behavior of AI systems is an important problem that, in practice, is generally avoided. While the XAI community has been developing an abundance of techniques, most incur a set of costs that the wider deep learning community has been unwilling to pay in most situations. We take a pragmatic view of the issue, and define a set of desiderata that capture both the ambitions of XAI and the practical constraints of deep learning. We describe an effective way to satisfy all the desiderata: train the AI system to build a causal model of itself. We develop an instance of this solution for Deep RL agents: Causal Self-Talk. CST operates by training the agent to communicate with itself across time. We implement this method in a simulated 3D environment, and show how it enables agents to generate faithful and semantically-meaningful explanations of their own behavior. Beyond explanations, we also demonstrate that these learned models provide new ways of building semantic control interfaces to AI systems.

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2211.09937 [cs.AI]
	(or arXiv:2211.09937v1 [cs.AI] for this version)

Submission history

From: Nicholas Roy
[v1] Thu, 17 Nov 2022 23:17:01 GMT (3432kb,D)
