Title: Online Abstract Dynamic Programming with Contractive Models

Abstract: This paper addresses abstract dynamic programming (DP) in the online setting, where the abstract DP mapping is time-varying rather than static. In this case, the optimal costs and policies generally differ from one time instant to the next, so the problem amounts to tracking time-varying optimal costs and policies, a task that arises in many practical problems. It is therefore necessary to analyze the performance of the classical value iteration (VI) and policy iteration (PI) algorithms in the online case. To this end, the paper develops and analyzes several online algorithms: approximate online VI, online PI, approximate online PI, online optimistic PI, approximate online optimistic PI, and asynchronous online PI and VI. It is proved that the tracking error bounds for all of these algorithms depend critically on the largest difference between any two consecutive abstract mappings. Examples are presented to illustrate the theoretical results.
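The dependence of the tracking error on the difference between consecutive mappings can be illustrated with a small numerical sketch. The code below is not taken from the paper: it assumes a two-state, two-control discounted problem in which only the stage cost changes over time, and the names `ALPHA`, `P`, `stage_cost`, and `drift` are hypothetical choices made for illustration. It applies one value-iteration step per time instant with the current mapping and measures how far the iterate is from the current optimal cost.

```python
import math

ALPHA = 0.9  # assumed contraction modulus (discount factor)

# Assumed transition probabilities P[u][x][y]: control u, state x, next state y.
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.5, 0.5], [0.9, 0.1]],
]


def stage_cost(t, x, u, drift):
    """Hypothetical time-varying stage cost g_t(x, u); drift=0 makes it static."""
    return 1.0 + x + 0.5 * u + math.sin(drift * t + x)


def bellman(t, J, drift):
    """Apply the time-t mapping: (T_t J)(x) = min_u [g_t(x,u) + ALPHA * E[J(y)]]."""
    return [
        min(
            stage_cost(t, x, u, drift)
            + ALPHA * sum(P[u][x][y] * J[y] for y in range(2))
            for u in range(2)
        )
        for x in range(2)
    ]


def fixed_point(t, drift, iters=300):
    """Approximate the optimal cost J*_t of the frozen mapping T_t by iteration."""
    J = [0.0, 0.0]
    for _ in range(iters):
        J = bellman(t, J, drift)
    return J


def max_tracking_error(drift, horizon=400, window=100):
    """Run online VI and return the largest sup-norm gap to J*_t over the
    final `window` steps, after the initial transient has died out."""
    J = [0.0, 0.0]
    worst = 0.0
    for t in range(horizon):
        J = bellman(t, J, drift)  # a single VI step with the current mapping
        if t >= horizon - window:
            J_star = fixed_point(t, drift)
            worst = max(worst, max(abs(a - b) for a, b in zip(J, J_star)))
    return worst


if __name__ == "__main__":
    for drift in (0.0, 0.01, 0.1):
        print(drift, max_tracking_error(drift))
```

With `drift = 0.0` the mapping is static and the iterate converges to the optimal cost, while larger values of `drift` make consecutive mappings differ more and leave a larger residual gap. This only mirrors the qualitative statement in the abstract; the paper's actual bounds and algorithms are not reproduced here.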
Subjects: Optimization and Control (math.OC)
Cite as: arXiv:2107.01383 [math.OC]
  (or arXiv:2107.01383v1 [math.OC] for this version)

Submission history

From: Xiuxian Li
[v1] Sat, 3 Jul 2021 08:58:21 GMT (19kb)