Risk-Aware Algorithms for Adversarial Contextual Bandits

Sun, Wen; Dey, Debadeepta; Kapoor, Ashish

(Submitted on 17 Oct 2016)

Abstract: In this work we consider adversarial contextual bandits with risk constraints. At each round, nature prepares a context, a cost for each arm, and additionally a risk for each arm. The learner leverages the context to pull an arm and then receives the cost and risk associated with the pulled arm. In addition to minimizing the cumulative cost, the learner must satisfy a long-term risk constraint: the average risk over all pulled arms should not exceed a pre-defined threshold. To address this problem, we first study the full-information setting, where in each round the learner receives an adversarial convex loss and a convex constraint. We develop a meta-algorithm leveraging online mirror descent for the full-information setting and extend it to the contextual bandit setting with risk constraints using expert advice. Our algorithms achieve near-optimal regret in terms of minimizing the total cost while maintaining sublinear growth of the cumulative risk constraint violation.
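To make the full-information setting concrete, the following is a minimal sketch of a generic primal-dual online mirror descent loop with a long-term constraint. It is not the paper's meta-algorithm: the entropic (multiplicative-weights) primal step, the dual shrinkage term, and all names (`primal_dual_omd`, `beta`, `eta`, `reg`) are assumptions chosen for illustration, and the toy costs and risks are fixed rather than adversarial.

```python
import numpy as np


def primal_dual_omd(costs, risks, beta, eta=0.1, reg=0.01):
    """Entropic mirror descent over arm distributions with a dual risk price.

    costs, risks: arrays of shape (T, K), fully revealed after each round.
    beta: risk threshold (hypothetical name); eta: step size; reg: dual shrinkage.
    Returns the distributions played in each round, shape (T, K).
    """
    T, K = costs.shape
    log_w = np.full(K, -np.log(K))  # primal iterate (log of a uniform distribution)
    lam = 0.0                       # dual iterate: price of violating the constraint
    played = np.empty((T, K))
    for t in range(T):
        w = np.exp(log_w)
        played[t] = w
        c, r = costs[t], risks[t]
        violation = float(r @ w) - beta  # expected risk minus threshold
        # Primal step: multiplicative-weights update on the Lagrangian gradient.
        log_w = log_w - eta * (c + lam * r)
        m = log_w.max()
        log_w = log_w - (m + np.log(np.exp(log_w - m).sum()))  # renormalize
        # Dual step: projected gradient ascent with a small shrinkage term.
        lam = max(0.0, lam + eta * (violation - reg * lam))
    return played


# Toy instance (assumed): arm 0 is cheap but risky, arm 1 is costly but safe.
T = 2000
costs = np.tile([0.0, 1.0], (T, 1))
risks = np.tile([1.0, 0.0], (T, 1))
played = primal_dual_omd(costs, risks, beta=0.5)
avg_cost = float((played * costs).sum(axis=1).mean())
avg_risk = float((played * risks).sum(axis=1).mean())
print(f"average cost {avg_cost:.3f}, average risk {avg_risk:.3f}")
```

In this sketch the dual variable `lam` rises whenever the expected risk exceeds `beta`, which makes the risky arm look more expensive in the next primal step. A cost-greedy learner would play arm 0 every round and incur average risk 1.0; the primal-dual loop instead trades cost for risk to keep the average near the threshold.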

Comments:	28 pages
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1610.05129 [cs.LG]
	(or arXiv:1610.05129v1 [cs.LG] for this version)

Submission history

From: Wen Sun
[v1] Mon, 17 Oct 2016 14:14:43 GMT (339kb)
