no code implementations • 9 Oct 2022 • Arash Bahari Kordabad, Mario Zanon, Sebastien Gros
This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if based on an inexact model.
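The baseline object being generalized here is the discounted Bellman fixed point of an MDP. A minimal sketch, using standard value iteration on a made-up 2-state, 2-action MDP (the data below is illustrative only and not from the paper, whose contribution concerns capturing these quantities with a finite-horizon undiscounted OCP):

```python
import numpy as np

# Hypothetical MDP data (not from the paper): transition probabilities
# P[a, s, s'] and stage costs C[a, s].
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
C = np.array([[1.0, 2.0],
              [1.5, 0.5]])
gamma = 0.9  # discount factor

# Standard value iteration: V <- min_a ( C[a] + gamma * P[a] V )
V = np.zeros(2)
for _ in range(500):
    Q = C + gamma * P @ V            # Q[a, s] = C[a, s] + gamma * sum_s' P[a, s, s'] V[s']
    V_new = Q.min(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmin(axis=0)            # greedy (optimal) action in each state
```

After convergence, `V` satisfies the discounted Bellman equation up to the tolerance, and `policy` is the associated optimal policy.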
no code implementations • 31 Mar 2022 • Arash Bahari Kordabad, Sebastien Gros
This paper discusses the functional stability of closed-loop Markov Chains under the optimal policies of discounted Markov Decision Processes (MDPs).
no code implementations • 25 Mar 2022 • Arash Bahari Kordabad, Hossein Nejatbakhsh Esfahani, WenQi Cai, Sebastien Gros
We show that the approximate Hessian converges to the exact Hessian at the optimal policy and allows for superlinear convergence in the learning, provided that the policy parametrization is sufficiently rich.
no code implementations • 24 May 2021 • Arash Bahari Kordabad, Sebastien Gros
In the Economic Nonlinear Model Predictive Control (ENMPC) context, closed-loop stability relates to the existence of a storage function satisfying a dissipation inequality.
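For reference, the standard strict dissipation inequality from the economic MPC literature (stated here in its common textbook form, which is not necessarily the exact variant used in this paper) requires a storage function $\lambda$ and a class-$\mathcal{K}$ function $\rho$ such that, for the dynamics $s^+ = f(s,a)$, stage cost $\ell$, and optimal steady state $(s^\star, a^\star)$:

```latex
% Strict dissipativity with respect to the optimal steady state (s*, a*):
\lambda(f(s,a)) - \lambda(s) \;\le\; \ell(s,a) - \ell(s^\star, a^\star) - \rho\big(\|s - s^\star\|\big)
```

Existence of such a $\lambda$ is the standard sufficient condition linking economic optimality to closed-loop stability of the steady state.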
no code implementations • 6 Apr 2021 • Hossein Nejatbakhsh Esfahani, Arash Bahari Kordabad, Sebastien Gros
We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties.
no code implementations • 6 Apr 2021 • Arash Bahari Kordabad, WenQi Cai, Sebastien Gros
In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure.
no code implementations • 6 Apr 2021 • Arash Bahari Kordabad, Hossein Nejatbakhsh Esfahani, Sebastien Gros
In this paper, we discuss the deterministic policy gradient using Actor-Critic methods based on a linear compatible advantage function approximator, where the input spaces are continuous.
no code implementations • 22 Mar 2021 • Hossein Nejatbakhsh Esfahani, Arash Bahari Kordabad, Sebastien Gros
This paper proposes an observer-based framework for solving Partially Observable Markov Decision Processes (POMDPs) when an accurate model is not available.
no code implementations • 22 Mar 2021 • Arash Bahari Kordabad, Hossein Nejatbakhsh Esfahani, Anastasios M. Lekkas, Sébastien Gros
A scenario-tree robust MPC is used to handle potential failures of the ship thrusters.
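A scenario-tree MPC enumerates branches of the uncertainty at each stage of a robust horizon and optimizes one input sequence per scenario, with non-anticipativity constraints tying branches that share a history. A minimal sketch of the tree's combinatorics only (the branching structure below is an assumption for illustration, not the paper's formulation):

```python
from itertools import product

def scenario_tree_leaves(n_thrusters, robust_horizon):
    """Enumerate leaf scenarios of a uniform scenario tree.

    Assumed branching (illustrative): at each stage the ship either keeps
    all thrusters (branch 0) or loses thruster i (branch i), giving
    n_thrusters + 1 children per node.
    """
    branches = n_thrusters + 1
    return list(product(range(branches), repeat=robust_horizon))

leaves = scenario_tree_leaves(n_thrusters=2, robust_horizon=2)
print(len(leaves))  # 3 branches over 2 stages -> 9 scenarios
```

The number of scenarios grows as (branches)^horizon, which is why scenario-tree MPC typically keeps the robust horizon short.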