1 code implementation • 30 Apr 2023 • Remo Sasso, Michelangelo Conserva, Paulo Rauber
Despite remarkable successes, deep reinforcement learning algorithms remain sample inefficient: they require an enormous amount of trial and error to find good policies.
Computational Efficiency • Model-based Reinforcement Learning
no code implementations • 24 Oct 2022 • Michelangelo Conserva, Paulo Rauber
Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis and implements a principled benchmark composed of environments that are diverse with respect to different measures of hardness.
1 code implementation • 9 Jul 2020 • Aditya Ramesh, Paulo Rauber, Michelangelo Conserva, Jürgen Schmidhuber
An agent in a nonstationary contextual bandit problem should balance exploration against the exploitation of (periodic or structured) patterns present in its previous experiences.
1 code implementation • ICLR 2019 • Paulo Rauber, Avinash Ummadisingu, Filipe Mutz, Juergen Schmidhuber
A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy.