1 code implementation • 24 May 2024 • Matej Cief, Branislav Kveton, Michal Kompan
In this paper, we study the problem of estimator selection and hyper-parameter tuning in off-policy evaluation.
1 code implementation • 6 May 2023 • Matej Cief, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan, Artur Bekasov
Off-policy evaluation (OPE) methods allow us to estimate the expected reward of a policy using logged data collected by a different policy.
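As a hedged illustration of the OPE idea (not the paper's own method), the classic inverse propensity scoring (IPS) estimator reweights each logged reward by the ratio of the target policy's action probability to the logging policy's; the simulated bandit data below is purely hypothetical:

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """IPS estimate of a target policy's value from logged interactions."""
    weights = target_probs / logging_probs  # importance weights
    return np.mean(weights * rewards)

# Hypothetical logs: two actions, logging policy is uniform (prob 0.5 each),
# target policy would pick action 1 with prob 0.8; action 1 yields reward 1.
rng = np.random.default_rng(0)
n = 100_000
actions = rng.integers(0, 2, size=n)
logging_probs = np.full(n, 0.5)
target_probs = np.where(actions == 1, 0.8, 0.2)
rewards = (actions == 1).astype(float)

estimate = ips_estimate(rewards, logging_probs, target_probs)
# The true value of the target policy here is 0.8, and the IPS
# estimate converges to it as the number of logged samples grows.
```

The estimator is unbiased when the logging probabilities are known and nonzero wherever the target policy puts mass, but its variance grows with the mismatch between the two policies, which is what motivates the more refined estimators studied in this line of work.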
no code implementations • 6 Jun 2022 • Matej Cief, Branislav Kveton, Michal Kompan
Off-policy learning is a framework for optimizing policies without deploying them, using data collected by another policy.
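To make the off-policy learning setting concrete (a minimal sketch, not the paper's algorithm), one common approach directly maximizes an IPS estimate of the new policy's value by gradient ascent; the softmax parameterization and simulated data below are illustrative assumptions:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def learn_policy(actions, rewards, logging_probs, n_actions=2, lr=0.2, steps=500):
    """Gradient ascent on softmax logits to maximize the IPS value estimate."""
    theta = np.zeros(n_actions)
    onehot = np.eye(n_actions)[actions]
    for _ in range(steps):
        probs = softmax(theta)
        w = probs[actions] / logging_probs  # importance weights
        # d/dtheta pi(a_i) = pi(a_i) * (onehot_i - probs), so the gradient of
        # the IPS objective is the weighted-reward average of this score term.
        grad = ((w * rewards)[:, None] * (onehot - probs)).mean(axis=0)
        theta += lr * grad
    return softmax(theta)

# Hypothetical logs: uniform logging policy, action 1 is always rewarded.
rng = np.random.default_rng(0)
actions = rng.integers(0, 2, size=5_000)
rewards = (actions == 1).astype(float)
logging_probs = np.full(len(actions), 0.5)

learned = learn_policy(actions, rewards, logging_probs)
# The learned policy shifts nearly all probability mass onto the rewarded action.
```

Because the objective is evaluated entirely on logged data, the policy improves without ever being deployed; the quality of the result hinges on how well the importance-weighted estimate tracks the true value, which connects back to the estimator-selection question raised above.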