no code implementations • 10 May 2024 • Christopher Amato
In this text, I will first give a brief description of the cooperative MARL problem in the form of the Dec-POMDP.
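The Dec-POMDP referred to above is conventionally defined as the tuple ⟨I, S, {A_i}, T, R, {Ω_i}, O, γ⟩. As a minimal illustrative sketch (a plain container for the model components, not a solver; all names are chosen here for illustration):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DecPOMDP:
    # The standard Dec-POMDP tuple <I, S, {A_i}, T, R, {Omega_i}, O, gamma>.
    agents: List[str]                   # I: the set of agents
    states: List[str]                   # S: the (hidden) state space
    actions: Dict[str, List[str]]       # A_i: each agent's local action set
    T: Callable                         # T(s' | s, joint_action): transition model
    R: Callable                         # R(s, joint_action): shared team reward
    observations: Dict[str, List[str]]  # Omega_i: each agent's observation set
    O: Callable                         # O(joint_obs | s', joint_action)
    gamma: float = 0.95                 # discount factor
```

Each agent selects actions from a local policy conditioned only on its own action–observation history; the reward R is shared, which is what makes the problem cooperative.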
no code implementations • 16 Oct 2023 • Chengguang Xu, Hieu T. Nguyen, Christopher Amato, Lawson L. S. Wong
Directly transferring SOTA navigation policies trained in simulation to the real world is challenging due to the visual domain gap and the absence of prior knowledge about unseen environments.
no code implementations • 3 Oct 2023 • Rohit Bokade, Xiaoning Jin, Christopher Amato
In this study, we propose a communication-based MARL framework for large-scale TSC.
no code implementations • 22 Jul 2023 • Hai Nguyen, Sammie Katt, Yuchen Xiao, Christopher Amato
Bayesian reinforcement learning (BRL), thanks to its sample efficiency and ability to exploit prior knowledge, is uniquely positioned as such a solution method.
no code implementations • 20 Feb 2023 • Enrico Marchesini, Luca Marzari, Alessandro Farinelli, Christopher Amato
In this paper, we investigate an alternative approach that uses domain knowledge to quantify the risk in the proximity of such states by defining a violation metric.
no code implementations • 20 Feb 2023 • Enrico Marchesini, Christopher Amato
Deep Policy Gradient (PG) algorithms employ value networks to drive the learning of parameterized policies and reduce the variance of the gradient estimates.
1 code implementation • 26 Jan 2023 • Brett Daley, Martha White, Christopher Amato, Marlos C. Machado
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging.
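One standard way to trade off this bias against variance is per-decision truncated importance sampling, as in Retrace(λ) (Munos et al., 2016). The sketch below is that generic estimator, not the estimator proposed in this paper; the truncated weights c_t = λ·min(1, ρ_t) correct off-policy bias without letting the variance of raw importance-sampling products explode:

```python
import numpy as np

def retrace_targets(q, exp_q_next, rewards, rhos, gamma=0.99, lam=0.9):
    """Retrace(lambda)-style off-policy targets (generic sketch).

    q[t]          = Q(s_t, a_t) along the behaviour trajectory
    exp_q_next[t] = E_{a~pi} Q(s_{t+1}, a)  (bootstrap term)
    rhos[t]       = pi(a_t|s_t) / mu(a_t|s_t)  (importance ratio)
    """
    T = len(rewards)
    # One-step TD errors under the target policy's bootstrap.
    td = [rewards[t] + gamma * exp_q_next[t] - q[t] for t in range(T)]
    corr = 0.0
    targets = np.empty(T)
    for t in reversed(range(T)):
        # Target is Q(s_t, a_t) plus the accumulated, truncated correction.
        targets[t] = q[t] + td[t] + corr
        # Carry the correction backwards, cut by c_t = lam * min(1, rho_t).
        corr = gamma * lam * min(1.0, rhos[t]) * (td[t] + corr)
    return targets
```

With λ = 1 and all ratios equal to 1 this reduces to the on-policy multistep return, which is a useful sanity check.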
1 code implementation • 3 Nov 2022 • Hai Nguyen, Andrea Baisero, Dian Wang, Christopher Amato, Robert Platt
Reinforcement learning in partially observable domains is challenging due to the lack of observable state information.
Tasks: Partially Observable Reinforcement Learning, reinforcement-learning, +1
no code implementations • 20 Sep 2022 • Yuchen Xiao, Weihao Tan, Christopher Amato
Synchronizing decisions across multiple agents in realistic settings is problematic, since it requires agents to wait for other agents to terminate and to communicate reliably about termination.

1 code implementation • 2 Jun 2022 • Kevin Esslinger, Robert Platt, Christopher Amato
Such tasks typically require some form of memory, where the agent has access to multiple past observations, in order to perform well.
Tasks: Partially Observable Reinforcement Learning, reinforcement-learning, +1
no code implementations • 17 Feb 2022 • Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato
Under this parameterization, in contrast to previous work, the belief over the state and dynamics is a more scalable inference problem.
no code implementations • 3 Jan 2022 • Xueguang Lyu, Andrea Baisero, Yuchen Xiao, Christopher Amato
Centralized Training for Decentralized Execution, where training is done in a centralized offline fashion, has become a popular solution paradigm in Multi-Agent Reinforcement Learning.
Tasks: Multi-agent Reinforcement Learning, reinforcement-learning, +1
no code implementations • 23 Dec 2021 • Brett Daley, Christopher Amato
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, particularly in the experience replay setting now commonly used with deep neural networks.
1 code implementation • 6 Dec 2021 • Brett Daley, Christopher Amato
Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g., the λ-return) for deep reinforcement learning.
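For reference, the λ-return admits a simple backward recursion, G_t = r_t + γ[(1−λ)V(s_{t+1}) + λG_{t+1}], which is what makes batched computation over a trajectory segment cheap. A minimal sketch of that recursion (illustrative; not this paper's caching scheme itself):

```python
import numpy as np

def lambda_returns(rewards, values, gamma=0.99, lam=0.95):
    """Compute lambda-returns via the backward recursion
    G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).

    values has length T + 1; values[T] bootstraps the tail of the segment.
    """
    T = len(rewards)
    G = np.empty(T)
    g = values[T]  # bootstrap at the end of the segment
    for t in reversed(range(T)):
        g = rewards[t] + gamma * ((1 - lam) * values[t + 1] + lam * g)
        G[t] = g
    return G
```

λ = 0 recovers the one-step TD target and λ = 1 the Monte Carlo return, so the parameter interpolates between bias and variance.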
1 code implementation • 1 Nov 2021 • Brett Daley, Christopher Amato
Deep Q-Network (DQN) marked a major milestone for reinforcement learning, demonstrating for the first time that human-level control policies could be learned directly from raw visual inputs via reward maximization.
no code implementations • 16 Oct 2021 • Yuchen Xiao, Xueguang Lyu, Christopher Amato
Using this local critic, each agent computes a baseline to reduce the variance of its policy gradient estimate, yielding an expected advantage over the other agents' action choices that implicitly improves credit assignment.
Tasks: Multi-agent Reinforcement Learning, Policy Gradient Methods, +2
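The baseline idea is in the spirit of counterfactual baselines such as COMA: marginalize the local critic over the agent's own actions while the other agents' actions are held fixed. A tiny sketch (illustrative function and argument names; not necessarily this paper's exact estimator):

```python
import numpy as np

def counterfactual_advantage(q_values, pi):
    """Counterfactual-style advantage for one agent (sketch).

    q_values[a] = Q(s, a, a_-i): critic value for each of the agent's own
                  actions, with the other agents' actions a_-i held fixed.
    pi[a]       = pi_i(a | s): the agent's own policy probabilities.
    """
    baseline = np.dot(pi, q_values)   # E_{a ~ pi_i} Q(s, a, a_-i)
    return q_values - baseline        # advantage of each own action
```

Because the baseline does not depend on the action actually taken, subtracting it leaves the policy gradient unbiased while reducing its variance.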
no code implementations • 29 Sep 2021 • Yuchen Xiao, Weihao Tan, Christopher Amato
Many realistic multi-agent problems naturally require agents to be capable of acting asynchronously, without waiting for other agents to terminate (e.g., multi-robot domains).
no code implementations • 10 Jun 2021 • Brett Daley, Christopher Amato
Adam is an adaptive gradient method that has experienced widespread adoption due to its fast and reliable training performance.
1 code implementation • 7 Jun 2021 • Andrea Baisero, Christopher Amato
We show that there is a mismatch between optimal POMDP policies and the optimal PSR policies derived from approximate rewards.
no code implementations • 7 Jun 2021 • Chengguang Xu, Christopher Amato, Lawson L. S. Wong
In this work, we propose an approach that leverages a rough 2-D map of the environment to navigate in novel environments without requiring further learning.
no code implementations • 25 May 2021 • Andrea Baisero, Christopher Amato
In partially observable reinforcement learning, offline training gives access to latent information which is not available during online training and/or execution, such as the system state.
Tasks: Partially Observable Reinforcement Learning, reinforcement-learning, +1
1 code implementation • 26 Apr 2021 • Mohammadreza Sharif, Deniz Erdogmus, Christopher Amato, Taskin Padir
State-of-the-art human-in-the-loop robot grasping suffers greatly from robustness issues in electromyography (EMG) inference.
no code implementations • 17 Mar 2021 • Roi Yehoshua, Juan Heredia-Juesas, Yushu Wu, Christopher Amato, Jose Martinez-Lorenzo
Target search and detection encompasses a variety of decision problems, such as coverage, surveillance, search, observation, and pursuit-evasion, among others.
no code implementations • 22 Feb 2021 • Brett Daley, Cameron Hickert, Christopher Amato
Our theory prescribes a special non-uniform distribution to cancel this effect, and we propose a stratified sampling scheme to efficiently implement it.
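A generic way to implement sampling from a prescribed non-uniform distribution with low variance is stratified sampling: draw one sample per equal-probability stratum of the unit interval instead of n i.i.d. uniforms. The sketch below is that generic scheme (the paper's exact scheme may differ):

```python
import numpy as np

def stratified_sample(probs, n, rng=None):
    """Draw n indices from the discrete distribution `probs` using
    stratified sampling: one uniform draw per equal-width stratum of
    [0, 1), inverted through the CDF.  Empirical counts track the target
    distribution much more tightly than independent draws.
    """
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(probs)
    # One uniform per stratum [k/n, (k+1)/n).
    u = (np.arange(n) + rng.random(n)) / n
    return np.searchsorted(cdf, u, side="right")
```

With probs = [0.5, 0.5] and n = 4, exactly two samples land on each index regardless of the random draws, illustrating the variance reduction.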
no code implementations • 8 Feb 2021 • Xueguang Lyu, Yuchen Xiao, Brett Daley, Christopher Amato
Centralized Training for Decentralized Execution, where agents are trained offline using centralized information but execute in a decentralized manner online, has gained popularity in the multi-agent reinforcement learning community.
no code implementations • 27 Jan 2021 • Ingy Elsayed-Aly, Suda Bharadwaj, Christopher Amato, Rüdiger Ehlers, Ufuk Topcu, Lu Feng
Multi-agent reinforcement learning (MARL) has been increasingly used in a wide range of safety-critical applications, which require guaranteed safety (e.g., no unsafe states are ever visited) during the learning process. Unfortunately, current MARL methods do not have safety guarantees.
Tasks: Multi-agent Reinforcement Learning, reinforcement-learning, +1
1 code implementation • 19 Oct 2020 • Hai Nguyen, Brett Daley, Xinchao Song, Christopher Amato, Robert Platt
Many important robotics problems are partially observable in the sense that a single visual or force-feedback measurement is insufficient to reconstruct the state.
1 code implementation • 3 Oct 2020 • Brett Daley, Christopher Amato
Many popular adaptive gradient methods such as Adam and RMSProp rely on an exponential moving average (EMA) to normalize their stepsizes.
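For reference, one Adam update in its standard form (Kingma & Ba, 2015) shows where the EMAs enter the stepsize normalization; RMSProp keeps only the second-moment EMA `v`:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update.  EMAs of the gradient (m) and the squared
    gradient (v) normalise the stepsize per parameter; the bias
    corrections compensate for the zero-initialised EMAs at warm-up.
    """
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)                    # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)                    # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

For a constant gradient the first step has magnitude ≈ lr regardless of the gradient's scale, which is exactly the scale-invariance the EMA normalization buys.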
no code implementations • 18 Apr 2020 • Yuchen Xiao, Joshua Hoffman, Christopher Amato
In real-world multi-robot systems, performing high-quality, collaborative behaviors requires robots to asynchronously reason about high-level action selection at varying time durations.
1 code implementation • NeurIPS 2019 • Brett Daley, Christopher Amato
Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context.
no code implementations • 24 Sep 2019 • Christopher Amato, Andrea Baisero
We propose to combine goal recognition with other observer tasks in order to obtain \emph{active goal recognition} (AGR).
no code implementations • 19 Sep 2019 • Yuchen Xiao, Joshua Hoffman, Tian Xia, Christopher Amato
In many real-world multi-robot tasks, high-quality solutions often require a team of robots to perform asynchronous actions under decentralized control.
no code implementations • 15 Dec 2018 • Xueguang Lyu, Christopher Amato
When multiple agents learn in a decentralized manner, the environment appears non-stationary from the perspective of an individual agent due to the exploration and learning of the other agents.
no code implementations • 14 Nov 2018 • Sammie Katt, Frans Oliehoek, Christopher Amato
Bayesian approaches provide a principled solution to the exploration-exploitation trade-off in Reinforcement Learning.
1 code implementation • 23 Oct 2018 • Brett Daley, Christopher Amato
Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context.
no code implementations • ICML 2017 • Sammie Katt, Frans A. Oliehoek, Christopher Amato
The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult.
no code implementations • 20 May 2018 • Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How
The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to.
no code implementations • 17 Oct 2017 • Trong Nghia Hoang, Yuchen Xiao, Kavinayan Sivakumar, Christopher Amato, Jonathan How
The practicality of existing works addressing this challenge is limited to only small-scale synchronous decision-making scenarios or a single agent planning its best response against a single adversary with fixed, procedurally characterized strategies.
no code implementations • 24 Jul 2017 • Miao Liu, Kavinayan Sivakumar, Shayegan Omidshafiei, Christopher Amato, Jonathan P. How
We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.
no code implementations • ICML 2017 • Shayegan Omidshafiei, Jason Pazis, Christopher Amato, Jonathan P. How, John Vian
Many real-world tasks involve multiple agents with partial observability and limited communication.
Tasks: Multi-agent Reinforcement Learning, reinforcement-learning, +1
no code implementations • 1 May 2015 • Miao Liu, Christopher Amato, Xuejun Liao, Lawrence Carin, Jonathan P. How
Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs).
no code implementations • 20 Feb 2015 • Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Christopher Amato, Jonathan P. How
To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the decentralized partially observable semi-Markov decision process (Dec-POSMDP).
1 code implementation • 4 Apr 2014 • Christopher Amato, Frans A. Oliehoek
Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces.
no code implementations • 12 Feb 2014 • Christopher Amato, George D. Konidaris, Gabriel Cruz, Christopher A. Maynor, Jonathan P. How, Leslie P. Kaelbling
We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function.
no code implementations • 4 Feb 2014 • Frans Adriaan Oliehoek, Matthijs T. J. Spaan, Christopher Amato, Shimon Whiteson
We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent.
no code implementations • 15 Jan 2014 • Daniel S. Bernstein, Christopher Amato, Eric A. Hansen, Shlomo Zilberstein
The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs.