1 code implementation • 24 May 2024 • Angeliki Kamoutsi, Peter Schmitt-Förster, Tobias Sutter, Volkan Cevher, John Lygeros
This work studies discrete-time discounted Markov decision processes with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior.
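The inverse problem can be illustrated on a tiny finite MDP: given a demonstrated optimal policy, search for cost functions under which that policy is optimal. The brute-force sketch below is only an illustration on a 2-state, 2-action MDP, not the paper's continuous state-and-action-space method; the function names and the candidate cost grid are hypothetical.

```python
import itertools
import numpy as np

def greedy_q(P, c, gamma=0.9, iters=500):
    """Value iteration; returns the optimal Q-table for stage cost c.
    P: (A, S, S) transition kernels, c: (S, A) stage costs."""
    V = np.zeros(c.shape[0])
    for _ in range(iters):
        Q = c + gamma * (P @ V).T          # Q[s, a] = c[s, a] + gamma * E[V(s')]
        V = Q.min(axis=1)
    return Q

def consistent_costs(P, pi_demo, grid=(0.0, 1.0), gamma=0.9, tol=1e-6):
    """Enumerate costs on a grid and keep those under which the
    demonstrated action is optimal in every state (ties allowed)."""
    S, A = P.shape[1], P.shape[0]
    keep = []
    for vals in itertools.product(grid, repeat=S * A):
        c = np.array(vals).reshape(S, A)
        Q = greedy_q(P, c, gamma)
        if all(Q[s, pi_demo[s]] <= Q[s].min() + tol for s in range(S)):
            keep.append(c)
    return keep

# Two states, two actions: action 0 stays put, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
pi_demo = [1, 0]                            # expert leaves state 0, stays in state 1
costs = consistent_costs(P, pi_demo)
```

Note that the all-zero cost rationalizes any behavior, so `costs` contains more than one element: the inverse problem is ill-posed without further structure, which is what motivates regularized or norm-constrained formulations.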
2 code implementations • 22 Sep 2022 • Luca Viano, Angeliki Kamoutsi, Gergely Neu, Igor Krawczuk, Volkan Cevher
Thanks to the proximal point method (PPM), we avoid the nested policy-evaluation and cost updates for online imitation learning (IL) that appear in the prior literature.
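The proximal point update itself is a single implicit step, x_{k+1} = argmin_x f(x) + (1/(2η))‖x − x_k‖². The sketch below applies it to a one-dimensional quadratic, where the argmin has a closed form; this is a generic PPM illustration, not the paper's imitation-learning algorithm, and the step size η is a hypothetical choice.

```python
def proximal_point_quadratic(x0, target, eta=1.0, iters=50):
    """Proximal point iteration for f(x) = 0.5 * (x - target)**2.
    The prox step argmin_x f(x) + (1/(2*eta)) * (x - xk)**2 has the
    closed form x_{k+1} = (eta * target + xk) / (1 + eta)."""
    x = x0
    trace = [x]
    for _ in range(iters):
        x = (eta * target + x) / (1.0 + eta)
        trace.append(x)
    return trace

trace = proximal_point_quadratic(x0=10.0, target=3.0)
```

Each step contracts the distance to the minimizer by a factor 1/(1 + η), so the iteration converges for any η > 0; unlike an explicit gradient step, no step-size restriction tied to the curvature of f is needed.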
no code implementations • 31 Dec 2021 • Angeliki Kamoutsi, Goran Banjac, John Lygeros
We consider large-scale Markov decision processes (MDPs) with an unknown cost function and employ stochastic convex optimization tools to address imitation learning, i.e., the problem of learning a policy from a finite set of expert demonstrations.
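At a high level, the finite set of expert demonstrations is reduced to an empirical objective that is convex in the decision variables and then attacked with stochastic first-order steps. The sketch below is a much simpler surrogate: stochastic gradient descent on a tabular softmax behavioral-cloning loss (convex in the logits), meant only to illustrate the "policy from finite demonstrations" setup. It is not the paper's formulation, and all names and demonstrations are hypothetical.

```python
import numpy as np

def fit_policy_sgd(demos, n_states, n_actions, lr=0.5, epochs=200, seed=0):
    """SGD on the negative log-likelihood of a tabular softmax policy,
    given expert (state, action) pairs; the loss is convex in the logits."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_states, n_actions))   # per-state action logits
    for _ in range(epochs):
        for s, a in rng.permutation(demos):   # shuffle demos each epoch
            p = np.exp(theta[s] - theta[s].max())
            p /= p.sum()                      # softmax over actions in state s
            grad = p.copy()
            grad[a] -= 1.0                    # softmax cross-entropy gradient
            theta[s] -= lr * grad
    return theta.argmax(axis=1)               # greedy policy per state

# Hypothetical demonstrations: expert switches in state 0, stays in state 1.
demos = [(0, 1), (1, 0), (0, 1), (1, 0), (0, 1)]
policy = fit_policy_sgd(demos, n_states=2, n_actions=2)
```

On these consistent demonstrations the recovered greedy policy reproduces the expert's action in each observed state.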
no code implementations • 28 Dec 2021 • Angeliki Kamoutsi, Goran Banjac, John Lygeros
We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations.