1 code implementation • 24 May 2024 • Matej Cief, Branislav Kveton, Michal Kompan
In this paper, we study the problem of estimator selection and hyper-parameter tuning in off-policy evaluation.
1 code implementation • 6 May 2023 • Matej Cief, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan, Artur Bekasov
Off-policy evaluation (OPE) methods allow us to estimate the expected reward of a policy using logged data collected by a different policy.
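As a hedged illustration of the OPE idea (not the paper's own method), the classic inverse propensity scoring (IPS) estimator reweights each logged reward by the ratio of the target policy's action probability to the logging policy's; the simulated bandit data below is purely hypothetical:

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """IPS estimate of a target policy's value from logged interactions."""
    weights = target_probs / logging_probs  # importance weights
    return np.mean(weights * rewards)

# Hypothetical logs: two actions, logging policy is uniform (prob 0.5 each),
# target policy would pick action 1 with prob 0.8; action 1 yields reward 1.
rng = np.random.default_rng(0)
n = 100_000
actions = rng.integers(0, 2, size=n)
logging_probs = np.full(n, 0.5)
target_probs = np.where(actions == 1, 0.8, 0.2)
rewards = (actions == 1).astype(float)

estimate = ips_estimate(rewards, logging_probs, target_probs)
# The true value of the target policy here is 0.8, and the IPS
# estimate converges to it as the number of logged samples grows.
```

The estimator is unbiased when the logging probabilities are known and nonzero wherever the target policy puts mass, but its variance grows with the mismatch between the two policies, which is what motivates the more refined estimators studied in this line of work.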
no code implementations • 6 Jun 2022 • Matej Cief, Branislav Kveton, Michal Kompan
Off-policy learning is a framework for optimizing policies without deploying them, using data collected by another policy.
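To make the off-policy learning setting concrete (a minimal sketch, not the paper's algorithm), one common approach directly maximizes an IPS estimate of the new policy's value by gradient ascent; the softmax parameterization and simulated data below are illustrative assumptions:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def learn_policy(actions, rewards, logging_probs, n_actions=2, lr=0.2, steps=500):
    """Gradient ascent on softmax logits to maximize the IPS value estimate."""
    theta = np.zeros(n_actions)
    onehot = np.eye(n_actions)[actions]
    for _ in range(steps):
        probs = softmax(theta)
        w = probs[actions] / logging_probs  # importance weights
        # d/dtheta pi(a_i) = pi(a_i) * (onehot_i - probs), so the gradient of
        # the IPS objective is the weighted-reward average of this score term.
        grad = ((w * rewards)[:, None] * (onehot - probs)).mean(axis=0)
        theta += lr * grad
    return softmax(theta)

# Hypothetical logs: uniform logging policy, action 1 is always rewarded.
rng = np.random.default_rng(0)
actions = rng.integers(0, 2, size=5_000)
rewards = (actions == 1).astype(float)
logging_probs = np.full(len(actions), 0.5)

learned = learn_policy(actions, rewards, logging_probs)
# The learned policy shifts nearly all probability mass onto the rewarded action.
```

Because the objective is evaluated entirely on logged data, the policy improves without ever being deployed; the quality of the result hinges on how well the importance-weighted estimate tracks the true value, which connects back to the estimator-selection question raised above.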