no code implementations • 20 Feb 2020 • David Venuto, Jhelum Chakravorty, Leonard Boussioux, Junhao Wang, Gavin McCracken, Doina Precup
Explicit engineering of reward functions for given environments has been a major hindrance to reinforcement learning methods.
1 code implementation • 28 Nov 2019 • Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, Doina Precup
In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999).
1 code implementation • 24 Sep 2019 • David Venuto, Leonard Boussioux, Junhao Wang, Rola Dali, Jhelum Chakravorty, Yoshua Bengio, Doina Precup
We define avoidance learning as the process of optimizing the agent's reward while avoiding dangerous behaviors given by a demonstrator.
no code implementations • 31 Mar 2017 • Jhelum Chakravorty, Aditya Mahajan
Sufficient conditions are identified under which the value function and the optimal strategy of a Markov decision process (MDP) are even and quasi-convex in the state.
Optimization and Control