Statistics > Machine Learning
Title: Optimizing the CVaR via Sampling
(Submitted on 15 Apr 2014 (v1), last revised 22 Nov 2014 (this version, v4))
Abstract: Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the CVaR gradient, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris.
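The abstract does not state the estimator itself, so the following is only a minimal sketch of what a likelihood-ratio style CVaR gradient estimate can look like, not the authors' exact method. It assumes the lower-tail convention (CVaR as the mean reward at or below the alpha-quantile), replaces the true quantile with the empirical one, and weights each tail sample's score function by its reward shortfall below that quantile. The function name `cvar_gradient_estimate` and its parameters are hypothetical.

```python
import numpy as np


def cvar_gradient_estimate(rewards, scores, alpha):
    """Sketch of a likelihood-ratio style gradient estimate for lower-tail CVaR.

    rewards: shape (n,), reward of each sample drawn from f(.; theta)
    scores:  shape (n, d), gradient of log f(x_i; theta) w.r.t. theta
    alpha:   tail level in (0, 1)

    Assumption: this form is an illustration, not taken from the abstract.
    """
    rewards = np.asarray(rewards, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = rewards.shape[0]
    # Empirical alpha-quantile of the rewards, used as the VaR estimate.
    var_hat = np.quantile(rewards, alpha)
    # Only samples in the lower tail contribute, weighted by their shortfall.
    in_tail = rewards <= var_hat
    weights = (rewards - var_hat) * in_tail
    return (scores * weights[:, None]).sum(axis=0) / (alpha * n)


# Toy check: X ~ N(theta, 1) with reward R = X, so the score is (x - theta).
# Shifting theta shifts the whole distribution, so the true gradient is 1.
rng = np.random.default_rng(0)
theta = 0.5
x = rng.normal(loc=theta, scale=1.0, size=200_000)
grad = cvar_gradient_estimate(x, (x - theta)[:, None], alpha=0.05)
```

In this toy Gaussian case the estimate should land close to 1. Because the quantile is itself estimated from the same samples, an estimate of this kind is generally biased for finite sample sizes, which is consistent with the abstract's mention of a bias analysis.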
Submission history
From: Aviv Tamar
[v1] Tue, 15 Apr 2014 10:32:05 GMT (79kb,D)
[v2] Sun, 29 Jun 2014 15:35:36 GMT (91kb,D)
[v3] Tue, 16 Sep 2014 15:32:48 GMT (97kb,D)
[v4] Sat, 22 Nov 2014 14:44:54 GMT (115kb,D)