Optimizing the CVaR via Sampling

Tamar, Aviv; Glassner, Yonatan; Mannor, Shie

(Submitted on 15 Apr 2014 (v1), last revised 22 Nov 2014 (this version, v4))

Abstract: Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the CVaR gradient, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application and learn a risk-sensitive controller for the game of Tetris.
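
To make the idea of a sampling-based, likelihood-ratio style CVaR gradient estimator concrete, here is a minimal sketch. The abstract does not state the formula, so everything below is an assumption made for illustration. The sketch takes the CVaR to be the mean of the lower alpha-tail, E[Z | Z <= VaR], and assumes the conditional-expectation form E[ d/dtheta log f(Z; theta) * (Z - VaR) | Z <= VaR ] for its gradient. The toy model Z ~ N(theta, 1), the function name cvar_gradient_estimate, and all parameter names are hypothetical and not taken from the paper. For this toy model the lower-tail CVaR equals theta minus a constant that does not depend on theta, so the exact gradient is 1, which gives a simple sanity check.

```python
import math
import random


def cvar_gradient_estimate(theta, alpha, n, rng):
    """Likelihood-ratio style estimate of d/dtheta CVaR_alpha(Z), Z ~ N(theta, 1).

    Illustrative sketch only: the estimator form is an assumption,
    not a formula quoted from the abstract.
    """
    # Draw n i.i.d. samples from the toy model.
    samples = [rng.gauss(theta, 1.0) for _ in range(n)]

    # Empirical alpha-quantile (VaR): the ceil(alpha * n)-th smallest sample.
    k = max(1, math.ceil(alpha * n))
    var_hat = sorted(samples)[k - 1]

    total = 0.0
    for z in samples:
        if z <= var_hat:  # only the lower alpha-tail contributes
            # Score function of N(theta, 1): d/dtheta log f(z; theta) = z - theta.
            score = z - theta
            total += score * (z - var_hat)

    # Dividing by alpha * n turns the tail sum into a conditional average.
    return total / (alpha * n)


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = cvar_gradient_estimate(theta=0.5, alpha=0.05, n=200_000, rng=rng)
    print(f"estimated CVaR gradient: {estimate:.3f} (exact value for this toy model: 1)")
```

Because the quantile is itself estimated from the same samples, an estimator of this kind is biased for finite n, which is consistent with the abstract's remark that the bias of the estimator is analyzed. In a stochastic gradient scheme, such an estimate would be plugged into an update of theta at each iteration.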

Comments:	To appear in AAAI 2015
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1404.3862 [stat.ML]
	(or arXiv:1404.3862v4 [stat.ML] for this version)

Submission history

From: Aviv Tamar
[v1] Tue, 15 Apr 2014 10:32:05 GMT (79kb,D)
[v2] Sun, 29 Jun 2014 15:35:36 GMT (91kb,D)
[v3] Tue, 16 Sep 2014 15:32:48 GMT (97kb,D)
[v4] Sat, 22 Nov 2014 14:44:54 GMT (115kb,D)
