Search Results for author: Trevor Schwantes

Found 1 paper, 0 papers with code

Probabilistic Offline Policy Ranking with Approximate Bayesian Computation

No code implementations · 17 Dec 2023 · Longchao Da, Porter Jenkins, Trevor Schwantes, Jeffrey Dotson, Hua Wei

In this paper, we present Probabilistic Offline Policy Ranking (POPR), a framework that addresses offline policy ranking (OPR) problems by leveraging expert data to characterize the probability that a candidate policy behaves like the experts, and by approximating the policy's entire performance posterior distribution to help with ranking.

Off-policy evaluation
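
To make the idea in the abstract above concrete, here is a minimal sketch of generic rejection-based Approximate Bayesian Computation applied to a toy ranking problem. It is not the authors' POPR implementation. The agreement-rate statistic, the uniform prior, the tolerance `eps`, the toy policies, and every function name are assumptions made for illustration only.

```python
import random
import statistics


def agreement_rate(policy, expert_data):
    """Fraction of expert (state, action) pairs on which the policy picks the same action."""
    return sum(policy(s) == a for s, a in expert_data) / len(expert_data)


def abc_posterior(observed_rate, n_obs, rng, n_draws=2000, eps=0.05):
    """Rejection ABC for theta, the (hypothetical) probability of matching the expert.

    Draw theta from a Uniform(0, 1) prior, simulate n_obs agreement indicators,
    and keep theta when the simulated rate is within eps of the observed rate.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()
        sim_rate = sum(rng.random() < theta for _ in range(n_obs)) / n_obs
        if abs(sim_rate - observed_rate) <= eps:
            accepted.append(theta)
    return accepted


def rank_policies(policies, expert_data, seed=0):
    """Rank policies by the mean of their approximate posterior over theta."""
    rng = random.Random(seed)
    scores = {}
    for name, policy in policies.items():
        rate = agreement_rate(policy, expert_data)
        samples = abc_posterior(rate, len(expert_data), rng)
        scores[name] = statistics.mean(samples)
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy expert data: in state s the expert takes action s % 2 (an assumption).
expert_data = [(s, s % 2) for s in range(50)]

# Three hypothetical candidate policies with decreasing expert agreement.
policies = {
    "good": lambda s: s % 2,
    "mixed": lambda s: (s % 2) if s % 4 != 0 else 1 - (s % 2),
    "bad": lambda s: 0,
}

ranking, scores = rank_policies(policies, expert_data)
print(ranking)
```

Because each policy keeps a set of accepted samples rather than a single point estimate, the same samples could also be used to compare full distributions (for example, the fraction of draws in which one policy's theta exceeds another's) instead of only the posterior mean used here.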
