1 code implementation • 30 Apr 2023 • Remo Sasso, Michelangelo Conserva, Paulo Rauber
Despite remarkable successes, deep reinforcement learning algorithms remain sample inefficient: they require an enormous amount of trial and error to find good policies.
Computational Efficiency • Model-based Reinforcement Learning
no code implementations • 24 Oct 2022 • Michelangelo Conserva, Paulo Rauber
Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis and implements a principled benchmark composed of environments that are diverse with respect to different measures of hardness.
1 code implementation • 26 May 2021 • Michelangelo Conserva, Marc Peter Deisenroth, K S Sesh Kumar
Many algorithms for ranked data become computationally intractable as the number of objects grows, due to the complex geometric structure induced by rankings.
1 code implementation • 9 Jul 2020 • Aditya Ramesh, Paulo Rauber, Michelangelo Conserva, Jürgen Schmidhuber
An agent in a nonstationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences.