no code implementations • 23 Feb 2024 • Filippo Lazzati, Mirco Mutti, Alberto Maria Metelli
In this paper, we introduce a novel notion of feasible reward set that captures the opportunities and limitations of the offline setting, and we analyze the complexity of its estimation.
no code implementations • 25 Apr 2023 • Alberto Maria Metelli, Filippo Lazzati, Marcello Restelli
We start by formally introducing the problem of estimating the feasible reward set and the corresponding PAC requirement, and by discussing the properties of particular classes of rewards.