Title: Temporal Regularization in Markov Decision Process

Abstract: Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
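
As a rough illustration of the idea described in the abstract, the sketch below shows one plausible form of temporal regularization for tabular TD(0): the bootstrap term mixes the value of the next state with the value of the previously visited state on the same trajectory. The function name, the mixing weight `beta`, the use of only the immediately preceding state, and the toy three-state cycle are all assumptions made for this example; they are not taken from the abstract and may differ from the paper's exact formulation.

```python
import numpy as np


def temporally_regularized_td0(trajectory, rewards, n_states,
                               gamma=0.9, beta=0.2, alpha=0.1, sweeps=2000):
    """Tabular TD(0) with a temporally smoothed bootstrap target (hypothetical sketch).

    trajectory: visited state indices s_0, ..., s_T
    rewards:    rewards[t] is received on the transition s_t -> s_{t+1}
    beta:       assumed mixing weight; beta = 0 recovers ordinary TD(0)
    """
    v = np.zeros(n_states)
    for _ in range(sweeps):
        for t in range(len(trajectory) - 1):
            s, s_next = trajectory[t], trajectory[t + 1]
            if t > 0:
                # Mix the next-state value with the previous state's value.
                s_prev = trajectory[t - 1]
                bootstrap = (1.0 - beta) * v[s_next] + beta * v[s_prev]
            else:
                # No predecessor at the first step: use the plain TD bootstrap.
                bootstrap = v[s_next]
            target = rewards[t] + gamma * bootstrap
            v[s] += alpha * (target - v[s])
    return v


# Toy deterministic cycle 0 -> 1 -> 2 -> 0 with reward 1 when leaving state 2.
trajectory = [0, 1, 2, 0, 1, 2, 0]
rewards = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

v_plain = temporally_regularized_td0(trajectory, rewards, n_states=3, beta=0.0)
v_reg = temporally_regularized_td0(trajectory, rewards, n_states=3, beta=0.2)
print("plain TD(0):      ", v_plain)
print("regularized TD(0):", v_reg)
```

With `beta = 0` the update reduces to ordinary TD(0), so the gap between the two printed value vectors gives a feel for the bias that the abstract says the technique trades for lower variance. This toy problem is deterministic, so it shows only the bias side of that trade-off.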
Comments: Published as a conference paper at NIPS 2018
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1811.00429 [cs.LG]
  (or arXiv:1811.00429v2 [cs.LG] for this version)

Submission history

From: Pierre Thodoroff
[v1] Thu, 1 Nov 2018 15:21:45 GMT (471kb,D)
[v2] Wed, 10 Apr 2019 23:03:53 GMT (471kb,D)
