Intelligence and Unambitiousness Using Algorithmic Information Theory

Cohen, Michael K.; Vellambi, Badri; Hutter, Marcus

doi:10.1109/JSAIT.2021.3073844

Authors: Michael K. Cohen, Badri Vellambi, Marcus Hutter

(Submitted on 13 May 2021)

Abstract: Algorithmic Information Theory has inspired intractable constructions of artificial general intelligence (AGI), and undiscovered tractable approximations are likely feasible. Reinforcement Learning (RL), the dominant paradigm by which an agent might learn to solve arbitrary solvable problems, gives an agent a dangerous incentive: to gain arbitrary "power" in order to intervene in the provision of its own reward. We review the arguments that generally intelligent algorithmic-information-theoretic reinforcement learners such as Hutter's (2005) AIXI would seek arbitrary power, including over us. Then, using an information-theoretic exploration schedule, and a setup inspired by causal influence theory, we present a variant of AIXI which learns to not seek arbitrary power; we call it "unambitious". We show that our agent learns to accrue reward at least as well as a human mentor, while relying on that mentor with diminishing probability. And given a formal assumption that we probe empirically, we show that eventually, the agent's world-model incorporates the following true fact: intervening in the "outside world" will have no effect on reward acquisition; hence, it has no incentive to shape the outside world.

Comments:	13 pages, 6 figures, 5-page appendix. arXiv admin note: text overlap with arXiv:1905.12186
Subjects:	Artificial Intelligence (cs.AI)
ACM classes:	I.2.0; I.2.6
Journal reference:	Journal of Selected Areas in Information Theory 2 (2021)
DOI:	10.1109/JSAIT.2021.3073844
Cite as:	arXiv:2105.06268 [cs.AI]
	(or arXiv:2105.06268v1 [cs.AI] for this version)

Submission history

From: Michael Cohen
[v1] Thu, 13 May 2021 13:10:28 GMT (397kb,D)
