no code implementations • 3 Nov 2023 • Jonathan Colaço Carr, Prakash Panangaden, Doina Precup
Current results guaranteeing the existence of optimal policies in Learning-from-Preferential-Feedback (LfPF) problems assume that both the preferences and the transition dynamics are determined by a Markov Decision Process.
no code implementations • 5 Oct 2023 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning.
2 code implementations • 9 May 2023 • Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup
Our policy gradient results make it possible to leverage approximate symmetries of the environment for policy optimization.
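As a rough illustration of the underlying idea (not the paper's construction), a score-function gradient estimate can be symmetrized by averaging it over the orbit of a known environment symmetry. The sketch below assumes a toy reflection symmetry and a softmax policy; the features, dynamics, and numbers are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): states in {-2,...,2},
# actions {0 = left, 1 = right}; dynamics assumed symmetric under the
# reflection s -> -s with left <-> right.
STATES = np.arange(-2, 3)
N_ACTIONS = 2

def features(s):
    return np.array([1.0, s, s**2])

def policy_probs(theta, s):
    logits = theta @ features(s)            # theta has shape (2, 3)
    z = np.exp(logits - logits.max())
    return z / z.sum()

def grad_log_pi(theta, s, a):
    # Gradient of log softmax: (1[b = a] - pi(b|s)) * features(s) per row b.
    p = policy_probs(theta, s)
    g = -np.outer(p, features(s))
    g[a] += features(s)
    return g

def reflect(s, a):
    # The assumed symmetry map g: negate the state, swap the actions.
    return -s, 1 - a

theta = np.zeros((N_ACTIONS, 3))
s = rng.choice(STATES)
a = rng.integers(N_ACTIONS)
ret = 1.0  # pretend sampled return

# Symmetrized estimate: average the score over {(s, a), g(s, a)} so the
# update respects the reflection symmetry.
s2, a2 = reflect(s, a)
grad = 0.5 * (grad_log_pi(theta, s, a) + grad_log_pi(theta, s2, a2)) * ret
theta += 0.1 * grad
```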
1 code implementation • 15 Sep 2022 • Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup
Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms.
no code implementations • 16 Aug 2022 • Chin-wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville
In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation.
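A minimal, hedged sketch of the geometric ingredient: Brownian motion on a manifold, the noising process behind manifold diffusion models, can be approximated by a geodesic random walk. Here it is on the unit sphere, with Gaussian steps in the tangent plane mapped back by the exponential map; this is purely illustrative and not the paper's variational framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def tangent_noise(x, scale):
    """Sample a Gaussian vector in the tangent plane at x on the sphere."""
    v = rng.normal(size=3) * scale
    return v - np.dot(v, x) * x          # project out the normal component

def exp_map(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along v."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return x
    return np.cos(n) * x + np.sin(n) * (v / n)

# Geodesic random walk approximating Brownian motion on S^2.
x = np.array([0.0, 0.0, 1.0])
dt = 1e-3
for _ in range(1000):
    x = exp_map(x, tangent_noise(x, np.sqrt(dt)))
print(x, np.linalg.norm(x))              # the iterate stays on the sphere
```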
no code implementations • 5 Jun 2021 • Clara Lacroce, Prakash Panangaden, Guillaume Rabusseau
The objective is to obtain a weighted finite automaton (WFA) that fits within a given size constraint and mimics the behaviour of the original model while minimizing some notion of distance between the black box and the extracted WFA.
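One standard route to such an approximation (a hedged sketch, not necessarily the paper's algorithm) is spectral truncation: evaluate the black-box WFA on a finite Hankel block, keep the top singular directions, and read a smaller WFA off the factorization. All automaton parameters below are made up.

```python
import numpy as np
from itertools import product

# Toy 3-state WFA over {a, b}; the parameters are purely illustrative.
alpha = np.array([1.0, 0.0, 0.0])
beta  = np.array([0.0, 0.5, 1.0])
A = {
    "a": np.array([[0.2, 0.3, 0.0], [0.0, 0.1, 0.4], [0.1, 0.0, 0.2]]),
    "b": np.array([[0.0, 0.4, 0.1], [0.3, 0.0, 0.0], [0.0, 0.2, 0.1]]),
}

def f(word):
    v = alpha
    for c in word:
        v = v @ A[c]
    return v @ beta

# Hankel blocks over all prefixes/suffixes of length <= 2.
words = [""] + ["".join(w) for L in (1, 2) for w in product("ab", repeat=L)]
H  = np.array([[f(p + s) for s in words] for p in words])
Hs = {c: np.array([[f(p + c + s) for s in words] for p in words]) for c in "ab"}

# Rank-k spectral truncation: keep the top-k singular directions of H.
k = 2
U, S, Vt = np.linalg.svd(H)
Uk, Sk, Vk = U[:, :k], S[:k], Vt[:k, :].T

alpha2 = Vk.T @ H[0]                       # empty-prefix row
beta2  = np.diag(1 / Sk) @ Uk.T @ H[:, 0]  # empty-suffix column
A2 = {c: np.diag(1 / Sk) @ Uk.T @ Hs[c] @ Vk for c in "ab"}

def f2(word):
    v = alpha2
    for c in word:
        v = v @ A2[c]
    return v @ beta2

# The 2-state extraction approximately matches the 3-state black box.
for w in ["", "a", "ab", "ba", "bb"]:
    print(w or "eps", round(f(w), 4), round(f2(w), 4))
```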
2 code implementations • NeurIPS 2021 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents.
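For intuition, a behavioural distance of this flavour can be computed on a tabular MDP by iterating its defining operator to a fixed point. The sketch below uses a MICo-style update in which the next states are sampled independently rather than optimally coupled; the MDP and its numbers are toy assumptions, and the paper's contribution is pairing such a distance with deep RL representation learning.

```python
import numpy as np

# Tiny 3-state MDP under a fixed policy: rewards r, transition matrix P.
r = np.array([0.0, 0.1, 1.0])
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
gamma = 0.9

# Fixed-point iteration for a MICo-style behavioural distance:
#   U(x, y) = |r_x - r_y| + gamma * E_{x'~P_x, y'~P_y} U(x', y'),
# where the expectation over independent next states is P @ U @ P.T.
n = len(r)
U = np.zeros((n, n))
for _ in range(500):
    U = np.abs(r[:, None] - r[None, :]) + gamma * P @ U @ P.T
print(np.round(U, 3))
```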
no code implementations • 3 Nov 2020 • Gavin McCracken, Colin Daniels, Rosie Zhao, Anna Brandenberger, Prakash Panangaden, Doina Precup
Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return.
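For readers new to the family: policy gradient methods ascend the score-function estimate grad J(theta) = E[G * grad log pi_theta(a|s)] of the expected return. A minimal REINFORCE loop on a toy bandit (illustrative only, not the paper's setting):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three-armed Gaussian bandit with made-up means; softmax policy over arms.
true_means = np.array([0.1, 0.5, 0.9])
theta = np.zeros(3)

for step in range(2000):
    p = np.exp(theta - theta.max()); p /= p.sum()
    a = rng.choice(3, p=p)
    G = rng.normal(true_means[a], 0.1)    # sampled return
    grad_log = -p; grad_log[a] += 1.0     # gradient of log softmax
    theta += 0.1 * G * grad_log           # REINFORCE update

p = np.exp(theta - theta.max()); p /= p.sum()
print(np.round(p, 3))                     # mass concentrates on the best arm
```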
no code implementations • 27 Mar 2020 • Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare
We present a distributional approach to the theoretical analysis of reinforcement learning algorithms with constant step-sizes.
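The flavour of such an analysis: with a constant step-size, the iterates of TD(0) form a Markov chain that settles into a stationary distribution around the true values rather than converging to a point. A hedged toy simulation (all numbers illustrative, not the paper's results):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-state Markov reward process under a fixed policy.
P = np.array([[0.5, 0.5], [0.2, 0.8]])
r = np.array([1.0, 0.0])
gamma, alpha = 0.9, 0.1

v_star = np.linalg.solve(np.eye(2) - gamma * P, r)   # exact values

V, s = np.zeros(2), 0
samples = []
for t in range(50_000):
    s_next = rng.choice(2, p=P[s])
    V[s] += alpha * (r[s] + gamma * V[s_next] - V[s])  # TD(0) update
    s = s_next
    if t > 25_000:
        samples.append(V.copy())         # post burn-in iterates

samples = np.array(samples)
print("v* =", np.round(v_star, 3))
print("stationary mean =", np.round(samples.mean(0), 3),
      "std =", np.round(samples.std(0), 3))
```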
1 code implementation • ICML 2020 • Avishek Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, William L. Hamilton
One effective solution is the use of normalizing flows to construct flexible posterior distributions.
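The mechanism being referenced: a normalizing flow pushes a simple base density through an invertible map and tracks the resulting density with the change-of-variables formula. A minimal one-dimensional affine example (Euclidean, unlike the paper's hyperbolic flows; the parameters are toy values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Change of variables: if z ~ N(0, 1) and x = f(z) is invertible, then
#   log p_X(x) = log p_Z(f^{-1}(x)) - log |det J_f(f^{-1}(x))|.
# Toy affine flow x = a*z + b.
a, b = 2.0, 1.0

def log_pz(z):
    return -0.5 * (z**2 + np.log(2 * np.pi))

def log_px(x):
    z = (x - b) / a                       # inverse map
    return log_pz(z) - np.log(abs(a))     # minus log |Jacobian|

# Sanity check against the exact N(b, a^2) log-density.
x = rng.normal(size=5)
exact = -0.5 * (((x - b) / a)**2 + np.log(2 * np.pi)) - np.log(abs(a))
print(np.allclose(log_px(x), exact))
```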
no code implementations • NeurIPS 2015 • Gheorghe Comanici, Doina Precup, Prakash Panangaden
We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs).
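The object under analysis: given features Phi, linear value function approximation seeks weights w with V approximately Phi @ w satisfying a projected Bellman equation. A hedged LSTD-style sketch on a toy chain with a hand-picked basis (uniform state weighting assumed; this illustrates the setting, not the paper's framework itself):

```python
import numpy as np

# Deterministic 5-state cycle with made-up rewards and a {1, s} basis.
n, gamma = 5, 0.9
P = np.diag(np.ones(n - 1), 1); P[-1, 0] = 1.0
r = np.linspace(0, 1, n)
Phi = np.stack([np.ones(n), np.arange(n)], axis=1)

# Projected Bellman equation Phi w = Pi T(Phi w), solved in closed form:
#   (Phi^T (Phi - gamma P Phi)) w = Phi^T r
A = Phi.T @ (Phi - gamma * P @ Phi)
b = Phi.T @ r
w = np.linalg.solve(A, b)

v_true = np.linalg.solve(np.eye(n) - gamma * P, r)
print("approx:", np.round(Phi @ w, 3))
print("true:  ", np.round(v_true, 3))
```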
no code implementations • 28 Dec 2014 • Bob Coecke, Ichiro Hasuo, Prakash Panangaden
The first QPL under the new name Quantum Physics and Logic was held in Reykjavik (2008), followed by Oxford (2009 and 2010), Nijmegen (2011), Brussels (2012) and Barcelona (2013).