no code implementations • 29 Dec 2023 • Melrose Roderick, Felix Berkenkamp, Fatemeh Sheikholeslami, J. Zico Kolter
In many real-world problems, there is a limited set of training data, but an abundance of unlabeled data.
no code implementations • 25 Nov 2023 • Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter
A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.
1 code implementation • ICLR 2021 • Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter
When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.
1 code implementation • 7 Jul 2020 • Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter
A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).
1 code implementation • 20 Nov 2017 • Melrose Roderick, James MacGlashan, Stefanie Tellex
The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and building point for much deep reinforcement learning research.
no code implementations • 2 Oct 2017 • Melrose Roderick, Christopher Grimm, Stefanie Tellex
We examine the problem of learning and planning on high-dimensional domains with long horizons and sparse rewards.
2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.
Ranked #1 on Continuous Control on Cart Pole (OpenAI Gym)