Mathematics > Optimization and Control
Title: Transfer-Entropy-Regularized Markov Decision Processes
(Submitted on 30 Aug 2017 (v1), revised 30 Jun 2018 (this version, v2), latest version 27 May 2020 (v3))
Abstract: We consider the framework of the transfer-entropy-regularized Markov Decision Process (TERMDP), in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized. Although TERMDP is generally a nonconvex optimization problem, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations, based on which an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm is proposed. Convergence of the proposed algorithm to a stationary point of the considered TERMDP is established. Applications of TERMDP are discussed in the context of networked control systems theory and non-equilibrium thermodynamics. The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision policies.
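For readers unfamiliar with the Arimoto-Blahut algorithm that the abstract uses as a point of comparison, the following is a minimal sketch of the classical version, which computes the capacity of a discrete memoryless channel by alternating updates. It is not the forward-backward procedure proposed in the paper, which is not reproduced here. The function name `blahut_arimoto` and its parameters `channel`, `tol`, and `max_iter` are illustrative choices, not taken from the paper.

```python
import math


def blahut_arimoto(channel, tol=1e-12, max_iter=10000):
    """Classical Arimoto-Blahut iteration for channel capacity.

    channel[x][y] is P(y | x); every row must sum to 1.
    Returns (capacity estimate in bits, input distribution).
    All names here are illustrative assumptions, not the paper's notation.
    """
    n_in = len(channel)
    n_out = len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output marginal q(y) = sum_x p(x) P(y | x).
        q = [sum(p[x] * channel[x][y] for x in range(n_in))
             for y in range(n_out)]
        # d[x] = KL divergence between P(. | x) and q, in nats.
        d = [sum(channel[x][y] * math.log(channel[x][y] / q[y])
                 for y in range(n_out) if channel[x][y] > 0.0)
             for x in range(n_in)]
        # Multiplicative update of the input distribution.
        w = [p[x] * math.exp(d[x]) for x in range(n_in)]
        z = sum(w)
        p = [wx / z for wx in w]
        # Standard bounds: log(z) <= capacity <= max_x d[x].
        lower = math.log(z)
        upper = max(d)
        if upper - lower < tol:
            break
    return lower / math.log(2.0), p
```

As a usage example, `blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])` treats a binary symmetric channel with crossover probability 0.1 and returns a capacity of about 0.531 bits with a uniform input distribution. The stopping rule relies on the gap between the standard lower and upper capacity bounds, so the returned value is within `tol` nats of the true capacity whenever the loop exits early.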
Submission history
From: Takashi Tanaka
[v1] Wed, 30 Aug 2017 03:14:33 GMT (269kb,D)
[v2] Sat, 30 Jun 2018 20:48:21 GMT (248kb,D)
[v3] Wed, 27 May 2020 19:36:47 GMT (329kb,D)