An Approximate Dynamic Programming Approach to Adversarial Online Learning

Kamble, Vijay; Loiseau, Patrick; Walrand, Jean

(Submitted on 16 Mar 2016 (v1), last revised 26 Oct 2020 (this version, v6))

Abstract: We describe an approximate dynamic programming (ADP) approach to compute approximations of the optimal strategies and of the minimal losses that can be guaranteed in discounted repeated games with vector-valued losses. Such games prominently arise in the analysis of regret in repeated decision-making in adversarial environments, also known as adversarial online learning. At the core of our approach is a characterization of the lower Pareto frontier of the set of expected losses that a player can guarantee in these games as the unique fixed point of a set-valued dynamic programming operator. When applied to the problem of regret minimization with discounted losses, our approach yields algorithms that achieve markedly improved performance bounds compared to off-the-shelf online learning algorithms like Hedge. These results thus suggest the significant potential of ADP-based approaches in adversarial online learning.
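For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of Hedge (exponential weights) evaluated by discounted regret. It is not the ADP method of the paper. The function name `hedge_discounted_regret`, the parameters `beta` (discount factor) and `eta` (learning rate), and the choice to feed discounted cumulative losses into the weights are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def hedge_discounted_regret(loss_rounds, beta, eta):
    """Run Hedge on a fixed loss sequence and return its discounted regret.

    loss_rounds: list of rounds; each round is a list with one loss per action.
    beta: discount factor in (0, 1) (assumed parameter name).
    eta: learning rate > 0 (assumed parameter name).
    """
    n_actions = len(loss_rounds[0])
    cum = [0.0] * n_actions   # discounted cumulative loss of each action
    learner_loss = 0.0        # discounted expected loss of the learner
    discount = 1.0            # beta ** t for the current round t

    for losses in loss_rounds:
        # Exponential weights on the losses seen so far; subtracting the
        # minimum keeps the exponentials numerically stable.
        base = min(cum)
        weights = [math.exp(-eta * (c - base)) for c in cum]
        total = sum(weights)
        probs = [w / total for w in weights]

        # The learner suffers the expected loss under its mixed action.
        learner_loss += discount * sum(p * l for p, l in zip(probs, losses))

        # Illustrative choice: update with discounted losses.
        cum = [c + discount * l for c, l in zip(cum, losses)]
        discount *= beta

    # Regret relative to the best fixed action in hindsight.
    return learner_loss - min(cum)
```

Calling `hedge_discounted_regret([[1.0, 0.0], [0.0, 1.0]], beta=0.9, eta=1.0)` returns the discounted regret of this Hedge variant on a two-round, two-action loss sequence. This is the kind of quantity the abstract's performance bounds concern.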

Comments:	There was an error in the statement of Proposition 4.2 in the previous version that is fixed in this version
Subjects:	Computer Science and Game Theory (cs.GT); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1603.04981 [cs.GT]
	(or arXiv:1603.04981v6 [cs.GT] for this version)

Submission history

From: Vijay Kamble
[v1] Wed, 16 Mar 2016 07:04:24 GMT (359kb,D)
[v2] Wed, 7 Dec 2016 03:08:00 GMT (362kb,D)
[v3] Sun, 31 Dec 2017 23:42:51 GMT (968kb,D)
[v4] Sun, 7 Jan 2018 23:51:48 GMT (1029kb,D)
[v5] Sun, 30 Sep 2018 23:51:16 GMT (1049kb,D)
[v6] Mon, 26 Oct 2020 16:55:34 GMT (2027kb,D)
