no code implementations • 11 Oct 2023 • Mirco Mutti, Riccardo De Santi, Marcello Restelli, Alexander Marx, Giorgia Ramponi
The prior is typically specified as a class of parametric distributions, the design of which can be cumbersome in practice, often resulting in the choice of uninformative priors.
no code implementations • 14 Feb 2022 • Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, Marcello Restelli
In this setting, the agent can take a finite amount of reward-free interactions from a subset of these environments.
no code implementations • ICML Workshop URL 2021 • Mirco Mutti, Riccardo De Santi, Marcello Restelli
In the maximum state entropy exploration framework, an agent interacts with a reward-free environment to learn a policy that maximizes the entropy of the expected state visitations it is inducing.
no code implementations • 3 Feb 2022 • Mirco Mutti, Riccardo De Santi, Piersilvio De Bartolomeis, Marcello Restelli
In particular, we show that erroneously optimizing the infinite trials objective in place of the actual finite trials one, as it is usually done, can lead to a significant approximation error.