
Title: State Relevance for Off-Policy Evaluation

Abstract: Importance sampling-based estimators for off-policy evaluation (OPE) are valued for their simplicity, unbiasedness, and reliance on relatively few assumptions. However, the variance of these estimators is often high, especially when trajectories are of different lengths. In this work, we introduce Omitting-States-Irrelevant-to-Return Importance Sampling (OSIRIS), an estimator which reduces variance by strategically omitting likelihood ratios associated with certain states. We formalize the conditions under which OSIRIS is unbiased and has lower variance than ordinary importance sampling, and we demonstrate these properties empirically.
Comments: ICML 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Journal reference: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9537-9546, 2021
Cite as: arXiv:2109.06310 [cs.LG]
  (or arXiv:2109.06310v1 [cs.LG] for this version)
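
The idea in the abstract of omitting likelihood ratios for certain states can be illustrated with a minimal Python sketch. This is not the paper's OSIRIS procedure: the abstract does not say how states are chosen for omission, so the `keep_state` predicate below is a hypothetical stand-in, and the toy environment (a variable number of "noise" steps whose actions never affect reward, followed by one "key" step), the policies `PI_B` and `PI_E`, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical tabular policies: policy[state][action] = probability.
PI_B = {"noise": [0.5, 0.5], "key": [0.5, 0.5]}  # behavior policy
PI_E = {"noise": [0.1, 0.9], "key": [0.2, 0.8]}  # evaluation policy


def trajectory_weight(trajectory, pi_e, pi_b, keep_state=lambda s: True):
    """Product of likelihood ratios pi_e(a|s) / pi_b(a|s) over kept states.

    With the default predicate every ratio is kept, which gives the
    ordinary importance sampling weight.
    """
    weight = 1.0
    for state, action, _reward in trajectory:
        if keep_state(state):
            weight *= pi_e[state][action] / pi_b[state][action]
    return weight


def is_estimate(trajectories, pi_e, pi_b, keep_state=lambda s: True):
    """Average of (weight * undiscounted return) over trajectories."""
    total = 0.0
    for trajectory in trajectories:
        ret = sum(reward for _, _, reward in trajectory)
        total += trajectory_weight(trajectory, pi_e, pi_b, keep_state) * ret
    return total / len(trajectories)


def sample_trajectory(rng, pi_b, n_noise):
    """Toy episode: n_noise reward-free steps, then one step that pays
    reward equal to the action taken (0 or 1)."""
    trajectory = []
    for _ in range(n_noise):
        action = 0 if rng.random() < pi_b["noise"][0] else 1
        trajectory.append(("noise", action, 0.0))
    action = 0 if rng.random() < pi_b["key"][0] else 1
    trajectory.append(("key", action, float(action)))
    return trajectory


if __name__ == "__main__":
    rng = random.Random(0)
    # Trajectories of different lengths, collected under the behavior policy.
    data = [sample_trajectory(rng, PI_B, rng.randint(1, 10)) for _ in range(5000)]
    print("ordinary IS:", is_estimate(data, PI_E, PI_B))
    print("ratios for 'noise' omitted:",
          is_estimate(data, PI_E, PI_B, keep_state=lambda s: s == "key"))
```

In this toy setting the actions taken in "noise" states are independent of the return, so dropping their ratios leaves the estimator's expectation unchanged (the true value under `PI_E` is 0.8) while removing a product of up to ten extra random factors from each weight. Whether such an omission is valid in a real problem depends on the conditions formalized in the paper, which this sketch does not reproduce.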

Submission history

From: Simon Shen
[v1] Mon, 13 Sep 2021 20:40:55 GMT
