no code implementations • 25 Mar 2024 • Tom Kuipers, Renukanandan Tumu, Shuo Yang, Milad Kazemi, Rahul Mangharam, Nicola Paoletti
In this work, we introduce MA-COPP, the first conformal prediction method to solve off-policy prediction (OPP) problems involving multi-agent systems, deriving joint prediction regions for all agents' trajectories when one or more "ego" agents change their policies.
no code implementations • 13 Feb 2024 • Milad Kazemi, Jessica Lally, Ekaterina Tishchenko, Hana Chockler, Nicola Paoletti
Our work addresses a fundamental problem in the context of counterfactual inference for Markov Decision Processes (MDPs).
no code implementations • 15 Dec 2023 • Milad Kazemi, Mateo Perez, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Alvaro Velasquez
We present a modular approach to reinforcement learning (RL) in environments consisting of simpler components evolving in parallel.
no code implementations • 16 Dec 2022 • Milad Kazemi, Nicola Paoletti
We introduce PCFTL (Probabilistic CounterFactual Temporal Logic), a new probabilistic temporal logic for the verification of Markov Decision Processes (MDPs).
no code implementations • 6 Aug 2022 • Abolfazl Lavaei, Mateo Perez, Milad Kazemi, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Majid Zamani
A key contribution is to leverage the convergence results for adversarial RL (minimax Q-learning) on finite stochastic arenas to provide control strategies maximizing the probability of satisfaction over the network of continuous-space systems.
no code implementations • 16 Jun 2022 • Milad Kazemi, Rupak Majumdar, Mahmoud Salamati, Sadegh Soudjani, Ben Wooding
The growth bound, together with the sampled trajectories, is then used to construct the abstraction and synthesise a controller.
no code implementations • 4 May 2020 • Milad Kazemi, Sadegh Soudjani
We use this procedure to guide the RL algorithm towards a policy that converges to an optimal policy under suitable assumptions on the process.