no code implementations • 4 Mar 2024 • Ariyan Bighashdel, Yongzhao Wang, Stephen Mcaleer, Rahul Savani, Frans A. Oliehoek
In game theory, a game refers to a model of interaction among rational decision-makers or players, making choices with the goal of achieving their individual objectives.
no code implementations • 19 Feb 2024 • Davide Mambelli, Stephan Bongers, Onno Zoeter, Matthijs T. J. Spaan, Frans A. Oliehoek
A well-established off-policy objective is the excursion objective.
no code implementations • 19 Nov 2023 • Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, Pradeep K. Murukannaiah
We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms.
no code implementations • 4 Jun 2023 • Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek
In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.
no code implementations • 1 Jun 2023 • Jinke He, Thomas M. Moerland, Frans A. Oliehoek
Model-based reinforcement learning has drawn considerable interest in recent years, given its promise to improve sample efficiency.
no code implementations • 29 May 2023 • Robert Loftin, Mustafa Mert Çelikok, Frans A. Oliehoek
Multiagent systems deployed in the real world need to cooperate with other agents (including humans) nearly as effectively as these agents cooperate with one another.
no code implementations • 27 Feb 2023 • Aleksander Czechowski, Frans A. Oliehoek
One of the main challenges of multi-agent learning is establishing convergence: in general, a collection of self-interested agents learning concurrently is not guaranteed to converge to a stable joint policy.
no code implementations • 7 Feb 2023 • Robert Loftin, Mustafa Mert Çelikok, Herke van Hoof, Samuel Kaski, Frans A. Oliehoek
A natural solution concept for many multiagent settings is the Stackelberg equilibrium, under which a "leader" agent selects a strategy that maximizes its own payoff assuming the "follower" plays its best response to this strategy.
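The solution concept named in this entry can be illustrated concretely. Below is a minimal sketch (the payoff matrices are invented for illustration, not taken from the paper) that computes a pure-strategy Stackelberg equilibrium of a bimatrix game by enumerating the leader's commitments and the follower's best responses:

```python
import numpy as np

# Illustrative 2x2 bimatrix game (payoffs are made up for this sketch).
# Rows: leader actions, columns: follower actions.
leader_payoff = np.array([[2.0, 4.0],
                          [1.0, 3.0]])
follower_payoff = np.array([[1.0, 0.0],
                            [0.0, 2.0]])

def stackelberg_pure(leader_payoff, follower_payoff):
    """Leader commits to a pure strategy; follower best-responds."""
    best = None
    for a in range(leader_payoff.shape[0]):
        b = int(np.argmax(follower_payoff[a]))  # follower's best response
        value = leader_payoff[a, b]
        if best is None or value > best[2]:
            best = (a, b, value)
    return best

a, b, v = stackelberg_pure(leader_payoff, follower_payoff)
# Committing to row 1 induces the follower to play column 1,
# giving the leader payoff 3 rather than the 2 it gets from row 0.
```

In general the leader can do even better by committing to a mixed strategy; the pure-strategy enumeration above is only the simplest case.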
no code implementations • NeurIPS 2021 • Rolf A. N. Starre, Marco Loog, Elena Congeduti, Frans A. Oliehoek
This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction.
1 code implementation • 1 Jul 2022 • Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek
Because of the high sample complexity of reinforcement learning, simulation is, to date, critical for its successful application.
no code implementations • 20 Jun 2022 • Robert Loftin, Frans A. Oliehoek
Learning to cooperate with other agents is challenging when those agents also possess the ability to adapt to our own behavior.
no code implementations • 3 Apr 2022 • Mustafa Mert Çelikok, Frans A. Oliehoek, Samuel Kaski
Centaurs are half-human, half-AI decision-makers where the AI's goal is to complement the human.
no code implementations • 17 Feb 2022 • Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato
Under this parameterization, and in contrast to previous work, maintaining the belief over the state and dynamics becomes a more scalable inference problem.
no code implementations • 3 Feb 2022 • Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).
1 code implementation • 27 Jan 2022 • Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek
To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation, based on a statistic that measures the accuracy of the approximate simulator.
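The adaptive choice between simulators described in this entry can be sketched in a few lines. This is an illustrative stand-in, not the paper's actual statistic: it tracks a running estimate of the learned simulator's one-step prediction error and falls back to the exact simulator when that estimate crosses a threshold (the threshold and decay values are arbitrary assumptions):

```python
# Illustrative sketch (not the paper's exact method): per simulation,
# use the cheap learned simulator only while a running estimate of its
# one-step prediction error stays below a threshold.
class AdaptiveSimulatorChooser:
    def __init__(self, threshold=0.1, decay=0.99):
        self.error = 0.0          # running error estimate of the learned simulator
        self.threshold = threshold
        self.decay = decay

    def choose(self):
        # Trust the approximate simulator only while its error is low.
        return "approximate" if self.error < self.threshold else "exact"

    def update(self, observed_error):
        # Exponential moving average of observed prediction errors.
        self.error = self.decay * self.error + (1 - self.decay) * observed_error
```

A fresh chooser starts optimistic (error estimate 0) and switches to the exact simulator once enough large errors are observed.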
1 code implementation • 30 Dec 2021 • Markus Peschl, Arkady Zgonnikov, Frans A. Oliehoek, Luciano C. Siebert
Inferring reward functions from demonstrations and from pairwise preferences are two promising approaches for aligning Reinforcement Learning (RL) agents with human intentions.
1 code implementation • ICLR 2022 • Elise van der Pol, Herke van Hoof, Frans A. Oliehoek, Max Welling
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information, yet is able to share experience between global symmetries in the joint state-action space of cooperative multi-agent systems.
no code implementations • 21 Dec 2020 • Jacopo Castellini, Sam Devlin, Frans A. Oliehoek, Rahul Savani
Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning.
no code implementations • 16 Nov 2020 • Wook Lee, Frans A. Oliehoek
One aspect that makes this problem challenging to optimize is that measuring the performance of candidate configurations by simulation can be computationally expensive, particularly in post-layout design.
1 code implementation • 3 Nov 2020 • Elena Congeduti, Alexander Mey, Frans A. Oliehoek
Sequential decision making techniques hold great promise to improve the performance of many real-world systems, but computational complexity hampers their principled application.
1 code implementation • NeurIPS 2020 • Mikko Lauri, Frans A. Oliehoek
The accuracy is quantified by a centralized prediction reward determined by a centralized decision-maker who perceives the observations gathered by all agents after the task ends.
1 code implementation • NeurIPS 2020 • Jinke He, Miguel Suau, Frans A. Oliehoek
In this work, we propose influence-augmented online planning, a principled method to transform a factored simulator of the entire environment into a local simulator that samples only the state variables that are most relevant to the observation and reward of the planning agent and captures the incoming influence from the rest of the environment using machine learning methods.
no code implementations • 21 Sep 2020 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Henri Bouma
Automated tracking is key to many computer vision applications.
no code implementations • 21 Sep 2020 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan
Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function.
2 code implementations • NeurIPS 2020 • Elise van der Pol, Daniel E. Worrall, Herke van Hoof, Frans A. Oliehoek, Max Welling
MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP.
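The equivariance property named in this entry can be checked numerically on a toy case. The sketch below is not the paper's architecture; it is a minimal weight-tied policy for a 1-D state that is equivariant to left-right reflection, so reflecting the input state exactly swaps the "left" and "right" action logits:

```python
import numpy as np

# Toy illustration (not the paper's network): a policy equivariant to a
# left-right reflection of a 1-D state. Reflecting the input permutes
# the action logits for "left" and "right".
def reflect_state(s):
    return s[::-1].copy()

def swap_actions(logits):
    return logits[::-1].copy()

# Weight tying enforces equivariance: the "right" logit is computed from
# the reflected state with the same shared weights as the "left" logit.
w = np.array([0.7, -0.2, 0.5])  # arbitrary shared weights (assumption)

def policy_logits(s):
    return np.array([w @ s, w @ reflect_state(s)])  # [left, right]

s = np.array([1.0, 2.0, 3.0])
# Equivariance holds exactly by construction:
assert np.allclose(policy_logits(reflect_state(s)),
                   swap_actions(policy_logits(s)))
```

The same idea, applied to permutations of agents in the joint state-action space, is what lets such networks share experience across symmetric situations.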
no code implementations • 15 May 2020 • Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do
In HAR, the development of Activity Recognition models depends on the data captured by these devices and the methods used to analyse them, which directly affect performance metrics.
no code implementations • 27 Apr 2020 • Christian Muench, Frans A. Oliehoek, Dariu M. Gavrila
Traffic scenarios are inherently interactive.
no code implementations • NeurIPS 2021 • João P. Abrantes, Arnaldo J. Abrantes, Frans A. Oliehoek
This work proposes Evolution via Evolutionary Reward (EvER), which lets learning alone drive the search for policies of increasing evolutionary fitness by aligning the reward function with the fitness function.
no code implementations • 19 Mar 2020 • Aleksander Czechowski, Frans A. Oliehoek
Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness.
1 code implementation • 27 Feb 2020 • Elise van der Pol, Thomas Kipf, Frans A. Oliehoek, Max Welling
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
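A loss of the kind this entry describes can be sketched compactly. This is a hedged simplification (assuming latent transitions are modelled additively, with a hinge term for negatives): the embedding of the next state should match the embedding of the current state shifted by a learned action translation, while embeddings of unrelated states are pushed at least a margin away:

```python
import numpy as np

# Minimal sketch of an action-equivariance contrastive loss, assuming an
# additive latent transition model: z(s') should be close to z(s) + t(a),
# while a negative state is pushed at least `margin` away.
def contrastive_loss(z_s, z_next, t_a, z_neg, margin=1.0):
    positive = np.sum((z_s + t_a - z_next) ** 2)             # pull positives together
    negative = max(0.0, margin - np.sum((z_s + t_a - z_neg) ** 2))  # push negatives apart
    return positive + negative
```

When the predicted latent lands exactly on the true next embedding and the negative is far away, both terms vanish and the loss is zero.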
1 code implementation • 18 Nov 2019 • Miguel Suau, Jinke He, Elena Congeduti, Rolf A. N. Starre, Aleksander Czechowski, Frans A. Oliehoek
Due to its perceptual limitations, an agent may have too little information about the state of the environment to act optimally.
no code implementations • 22 Jul 2019 • Frans A. Oliehoek, Stefan Witwicki, Leslie P. Kaelbling
In these ways, this paper deepens our understanding of abstraction in a wide range of sequential decision making settings, providing the basis for new approaches and algorithms for a large class of problems.
no code implementations • 8 Nov 2018 • Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson
Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical.
no code implementations • 18 Jun 2018 • Frans A. Oliehoek, Rahul Savani, Jose Gallego, Elise van der Pol, Roderich Groß
Save for some special cases, current training methods for Generative Adversarial Networks (GANs) are at best guaranteed to converge to a "local Nash equilibrium" (LNE).
no code implementations • ICML 2017 • Sammie Katt, Frans A. Oliehoek, Christopher Amato
The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult.
no code implementations • 2 Dec 2017 • Frans A. Oliehoek, Rahul Savani, Jose Gallego-Posada, Elise van der Pol, Edwin D. de Jong, Roderich Gross
We introduce Generative Adversarial Network Games (GANGs), which explicitly model a finite zero-sum game between a generator ($G$) and classifier ($C$) that use mixed strategies.
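Computing mixed-strategy equilibria of a finite zero-sum game, as GANGs require, can be illustrated on a tiny matrix game. The sketch below (the game and solver choice are illustrative assumptions, not the paper's method) runs fictitious play on matching pennies, whose unique equilibrium mixes both actions with probability 1/2:

```python
import numpy as np

# Sketch: approximating the mixed equilibrium of a tiny finite zero-sum
# game by fictitious play. The 2x2 payoff matrix (matching pennies) is
# an illustrative stand-in, not the GANG payoff structure.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # row player's payoff; column player gets -A

counts_row = np.ones(2)       # empirical action counts for each player
counts_col = np.ones(2)
for _ in range(5000):
    # Each player best-responds to the opponent's empirical mixture.
    br_row = int(np.argmax(A @ (counts_col / counts_col.sum())))
    br_col = int(np.argmin((counts_row / counts_row.sum()) @ A))
    counts_row[br_row] += 1
    counts_col[br_col] += 1

mix_row = counts_row / counts_row.sum()  # converges toward [0.5, 0.5]
```

Fictitious play is known to converge in two-player zero-sum games, though slowly; practical equilibrium computation would use linear programming or no-regret methods instead.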
no code implementations • 22 Jun 2016 • Auke J. Wiggers, Frans A. Oliehoek, Diederik M. Roijers
Zero-sum stochastic games provide a rich model for competitive decision making.
no code implementations • 25 Feb 2016 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek
Submodular function maximization finds application in a variety of real-world decision-making problems.
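The standard algorithm for this problem is the greedy method, which for monotone submodular functions achieves a (1 - 1/e) approximation guarantee. Below is a minimal sketch on a toy coverage function (the sensor-coverage data is an invented stand-in, not the paper's application):

```python
# Greedy maximization of a monotone submodular set function. For such
# functions the greedy rule enjoys a (1 - 1/e) approximation guarantee.
def greedy_submodular(ground_set, f, k):
    selected = set()
    for _ in range(k):
        # Add the element with the largest marginal gain.
        best = max((x for x in ground_set if x not in selected),
                   key=lambda x: f(selected | {x}) - f(selected))
        selected.add(best)
    return selected

# Toy coverage function: the value of a set of "sensors" is the size of
# the union of what they observe (coverage is submodular).
coverage = {1: {"a", "b", "c"}, 2: {"d", "e"}, 3: {"e"}}
f = lambda S: len(set().union(*(coverage[x] for x in S))) if S else 0
chosen = greedy_submodular(set(coverage), f, k=2)
# Greedy first picks sensor 1 (gain 3), then sensor 2 (gain 2).
```

Coverage-style objectives like this one are the textbook example; sensor selection, as in the entries above, is a natural instance.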
no code implementations • 30 Nov 2015 • Athirai A. Irissappane, Frans A. Oliehoek, Jie Zhang
In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors).
no code implementations • 29 Nov 2015 • Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.
no code implementations • 29 Nov 2015 • Philipp Robbel, Frans A. Oliehoek, Mykel J. Kochenderfer
We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP.
no code implementations • 18 Feb 2015 • Frans A. Oliehoek, Matthijs T. J. Spaan, Stefan Witwicki
Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents.
1 code implementation • 4 Apr 2014 • Christopher Amato, Frans A. Oliehoek
Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces.
no code implementations • 1 Aug 2011 • Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan
Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type.