no code implementations • 4 Mar 2024 • Ariyan Bighashdel, Yongzhao Wang, Stephen Mcaleer, Rahul Savani, Frans A. Oliehoek
In game theory, a game refers to a model of interaction among rational decision-makers or players, making choices with the goal of achieving their individual objectives.
no code implementations • 19 Feb 2024 • Davide Mambelli, Stephan Bongers, Onno Zoeter, Matthijs T. J. Spaan, Frans A. Oliehoek
A well-established off-policy objective is the excursion objective.
no code implementations • 19 Nov 2023 • Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, Pradeep K. Murukannaiah
We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms.
no code implementations • 4 Jun 2023 • Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek
In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.
no code implementations • 1 Jun 2023 • Jinke He, Thomas M. Moerland, Frans A. Oliehoek
Model-based reinforcement learning has drawn considerable interest in recent years, given its promise to improve sample efficiency.
no code implementations • 29 May 2023 • Robert Loftin, Mustafa Mert Çelikok, Frans A. Oliehoek
Multiagent systems deployed in the real world need to cooperate with other agents (including humans) nearly as effectively as these agents cooperate with one another.
no code implementations • 27 Feb 2023 • Aleksander Czechowski, Frans A. Oliehoek
One of the main challenges of multi-agent learning is establishing convergence: in general, a collection of self-interested agents learning concurrently is not guaranteed to converge to a stable joint policy.
no code implementations • 7 Feb 2023 • Robert Loftin, Mustafa Mert Çelikok, Herke van Hoof, Samuel Kaski, Frans A. Oliehoek
A natural solution concept for many multiagent settings is the Stackelberg equilibrium, under which a "leader" agent selects a strategy that maximizes its own payoff assuming the "follower" plays its best response to this strategy.
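The solution concept named in this entry can be illustrated concretely. Below is a minimal sketch (the payoff matrices are invented for illustration, not taken from the paper) that computes a pure-strategy Stackelberg equilibrium of a bimatrix game by enumerating the leader's commitments and the follower's best responses:

```python
import numpy as np

# Illustrative 2x2 bimatrix game (payoffs are made up for this sketch).
# Rows: leader actions, columns: follower actions.
leader_payoff = np.array([[2.0, 4.0],
                          [1.0, 3.0]])
follower_payoff = np.array([[1.0, 0.0],
                            [0.0, 2.0]])

def stackelberg_pure(leader_payoff, follower_payoff):
    """Leader commits to a pure strategy; follower best-responds."""
    best = None
    for a in range(leader_payoff.shape[0]):
        b = int(np.argmax(follower_payoff[a]))  # follower's best response
        value = leader_payoff[a, b]
        if best is None or value > best[2]:
            best = (a, b, value)
    return best

a, b, v = stackelberg_pure(leader_payoff, follower_payoff)
# Committing to row 1 induces the follower to play column 1,
# giving the leader payoff 3 rather than the 2 it gets from row 0.
```

In general the leader can do even better by committing to a mixed strategy; the pure-strategy enumeration above is only the simplest case.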
no code implementations • NeurIPS 2021 • Rolf A. N. Starre, Marco Loog, Elena Congeduti, Frans A. Oliehoek
This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction.
1 code implementation • 1 Jul 2022 • Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek
Because of the high sample complexity of reinforcement learning, simulation is, to date, critical for its successful application.
no code implementations • 20 Jun 2022 • Robert Loftin, Frans A. Oliehoek
Learning to cooperate with other agents is challenging when those agents also possess the ability to adapt to our own behavior.
no code implementations • 3 Apr 2022 • Mustafa Mert Çelikok, Frans A. Oliehoek, Samuel Kaski
Centaurs are half-human, half-AI decision-makers where the AI's goal is to complement the human.
no code implementations • 17 Feb 2022 • Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato
Under this parameterization, and in contrast to previous work, maintaining the belief over the state and dynamics becomes a more scalable inference problem.
no code implementations • 3 Feb 2022 • Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).
1 code implementation • 27 Jan 2022 • Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek
To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation, based on a statistic that measures the accuracy of the approximate simulator.
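The adaptive choice between simulators described in this entry can be sketched in a few lines. This is an illustrative stand-in, not the paper's actual statistic: it tracks a running estimate of the learned simulator's one-step prediction error and falls back to the exact simulator when that estimate crosses a threshold (the threshold and decay values are arbitrary assumptions):

```python
# Illustrative sketch (not the paper's exact method): per simulation,
# use the cheap learned simulator only while a running estimate of its
# one-step prediction error stays below a threshold.
class AdaptiveSimulatorChooser:
    def __init__(self, threshold=0.1, decay=0.99):
        self.error = 0.0          # running error estimate of the learned simulator
        self.threshold = threshold
        self.decay = decay

    def choose(self):
        # Trust the approximate simulator only while its error is low.
        return "approximate" if self.error < self.threshold else "exact"

    def update(self, observed_error):
        # Exponential moving average of observed prediction errors.
        self.error = self.decay * self.error + (1 - self.decay) * observed_error
```

A fresh chooser starts optimistic (error estimate 0) and switches to the exact simulator once enough large errors are observed.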
1 code implementation • 30 Dec 2021 • Markus Peschl, Arkady Zgonnikov, Frans A. Oliehoek, Luciano C. Siebert
Inferring reward functions from demonstrations and from pairwise preferences are two promising approaches for aligning Reinforcement Learning (RL) agents with human intentions.
1 code implementation • ICLR 2022 • Elise van der Pol, Herke van Hoof, Frans A. Oliehoek, Max Welling
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information, yet is able to share experience between global symmetries in the joint state-action space of cooperative multi-agent systems.
no code implementations • 21 Dec 2020 • Jacopo Castellini, Sam Devlin, Frans A. Oliehoek, Rahul Savani
Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning.
no code implementations • 16 Nov 2020 • Wook Lee, Frans A. Oliehoek
One aspect that makes this problem challenging to optimize is that measuring the performance of candidate configurations by simulation can be computationally expensive, particularly in post-layout design.
1 code implementation • 3 Nov 2020 • Elena Congeduti, Alexander Mey, Frans A. Oliehoek
Sequential decision making techniques hold great promise to improve the performance of many real-world systems, but computational complexity hampers their principled application.
1 code implementation • NeurIPS 2020 • Mikko Lauri, Frans A. Oliehoek
The accuracy is quantified by a centralized prediction reward determined by a centralized decision-maker who perceives the observations gathered by all agents after the task ends.
1 code implementation • NeurIPS 2020 • Jinke He, Miguel Suau, Frans A. Oliehoek
In this work, we propose influence-augmented online planning, a principled method to transform a factored simulator of the entire environment into a local simulator that samples only the state variables that are most relevant to the observation and reward of the planning agent and captures the incoming influence from the rest of the environment using machine learning methods.
no code implementations • 21 Sep 2020 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Henri Bouma
Automated tracking is key to many computer vision applications.
no code implementations • 21 Sep 2020 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan
Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function.
2 code implementations • NeurIPS 2020 • Elise van der Pol, Daniel E. Worrall, Herke van Hoof, Frans A. Oliehoek, Max Welling
MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP.
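The equivariance property named in this entry can be checked numerically on a toy case. The sketch below is not the paper's architecture; it is a minimal weight-tied policy for a 1-D state that is equivariant to left-right reflection, so reflecting the input state exactly swaps the "left" and "right" action logits:

```python
import numpy as np

# Toy illustration (not the paper's network): a policy equivariant to a
# left-right reflection of a 1-D state. Reflecting the input permutes
# the action logits for "left" and "right".
def reflect_state(s):
    return s[::-1].copy()

def swap_actions(logits):
    return logits[::-1].copy()

# Weight tying enforces equivariance: the "right" logit is computed from
# the reflected state with the same shared weights as the "left" logit.
w = np.array([0.7, -0.2, 0.5])  # arbitrary shared weights (assumption)

def policy_logits(s):
    return np.array([w @ s, w @ reflect_state(s)])  # [left, right]

s = np.array([1.0, 2.0, 3.0])
# Equivariance holds exactly by construction:
assert np.allclose(policy_logits(reflect_state(s)),
                   swap_actions(policy_logits(s)))
```

The same idea, applied to permutations of agents in the joint state-action space, is what lets such networks share experience across symmetric situations.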
no code implementations • 15 May 2020 • Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do
In HAR, the development of Activity Recognition models depends on the data captured by these devices and the methods used to analyse them, which directly affect performance metrics.
no code implementations • 27 Apr 2020 • Christian Muench, Frans A. Oliehoek, Dariu M. Gavrila
Traffic scenarios are inherently interactive.
no code implementations • NeurIPS 2021 • João P. Abrantes, Arnaldo J. Abrantes, Frans A. Oliehoek
This work proposes Evolution via Evolutionary Reward (EvER), which lets learning alone drive the search for policies of increasing evolutionary fitness by aligning the reward function with the fitness function.
no code implementations • 19 Mar 2020 • Aleksander Czechowski, Frans A. Oliehoek
Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness.
1 code implementation • 27 Feb 2020 • Elise van der Pol, Thomas Kipf, Frans A. Oliehoek, Max Welling
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
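A loss of the kind this entry describes can be sketched compactly. This is a hedged simplification (assuming latent transitions are modelled additively, with a hinge term for negatives): the embedding of the next state should match the embedding of the current state shifted by a learned action translation, while embeddings of unrelated states are pushed at least a margin away:

```python
import numpy as np

# Minimal sketch of an action-equivariance contrastive loss, assuming an
# additive latent transition model: z(s') should be close to z(s) + t(a),
# while a negative state is pushed at least `margin` away.
def contrastive_loss(z_s, z_next, t_a, z_neg, margin=1.0):
    positive = np.sum((z_s + t_a - z_next) ** 2)             # pull positives together
    negative = max(0.0, margin - np.sum((z_s + t_a - z_neg) ** 2))  # push negatives apart
    return positive + negative
```

When the predicted latent lands exactly on the true next embedding and the negative is far away, both terms vanish and the loss is zero.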
1 code implementation • 18 Nov 2019 • Miguel Suau, Jinke He, Elena Congeduti, Rolf A. N. Starre, Aleksander Czechowski, Frans A. Oliehoek
Due to its perceptual limitations, an agent may have too little information about the state of the environment to act optimally.
no code implementations • 22 Jul 2019 • Frans A. Oliehoek, Stefan Witwicki, Leslie P. Kaelbling
In these ways, this paper deepens our understanding of abstraction in a wide range of sequential decision making settings, providing the basis for new approaches and algorithms for a large class of problems.
no code implementations • 8 Nov 2018 • Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson
Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical.
no code implementations • 18 Jun 2018 • Frans A. Oliehoek, Rahul Savani, Jose Gallego, Elise van der Pol, Roderich Groß
Save for some special cases, current training methods for Generative Adversarial Networks (GANs) are at best guaranteed to converge to a "local Nash equilibrium" (LNE).
no code implementations • ICML 2017 • Sammie Katt, Frans A. Oliehoek, Christopher Amato
The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult.
no code implementations • 2 Dec 2017 • Frans A. Oliehoek, Rahul Savani, Jose Gallego-Posada, Elise van der Pol, Edwin D. de Jong, Roderich Gross
We introduce Generative Adversarial Network Games (GANGs), which explicitly model a finite zero-sum game between a generator ($G$) and classifier ($C$) that use mixed strategies.
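Computing mixed-strategy equilibria of a finite zero-sum game, as GANGs require, can be illustrated on a tiny matrix game. The sketch below (the game and solver choice are illustrative assumptions, not the paper's method) runs fictitious play on matching pennies, whose unique equilibrium mixes both actions with probability 1/2:

```python
import numpy as np

# Sketch: approximating the mixed equilibrium of a tiny finite zero-sum
# game by fictitious play. The 2x2 payoff matrix (matching pennies) is
# an illustrative stand-in, not the GANG payoff structure.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # row player's payoff; column player gets -A

counts_row = np.ones(2)       # empirical action counts for each player
counts_col = np.ones(2)
for _ in range(5000):
    # Each player best-responds to the opponent's empirical mixture.
    br_row = int(np.argmax(A @ (counts_col / counts_col.sum())))
    br_col = int(np.argmin((counts_row / counts_row.sum()) @ A))
    counts_row[br_row] += 1
    counts_col[br_col] += 1

mix_row = counts_row / counts_row.sum()  # converges toward [0.5, 0.5]
```

Fictitious play is known to converge in two-player zero-sum games, though slowly; practical equilibrium computation would use linear programming or no-regret methods instead.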
no code implementations • 22 Jun 2016 • Auke J. Wiggers, Frans A. Oliehoek, Diederik M. Roijers
Zero-sum stochastic games provide a rich model for competitive decision making.
no code implementations • 25 Feb 2016 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek
Submodular function maximization finds application in a variety of real-world decision-making problems.
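The standard algorithm for this problem is the greedy method, which for monotone submodular functions achieves a (1 - 1/e) approximation guarantee. Below is a minimal sketch on a toy coverage function (the sensor-coverage data is an invented stand-in, not the paper's application):

```python
# Greedy maximization of a monotone submodular set function. For such
# functions the greedy rule enjoys a (1 - 1/e) approximation guarantee.
def greedy_submodular(ground_set, f, k):
    selected = set()
    for _ in range(k):
        # Add the element with the largest marginal gain.
        best = max((x for x in ground_set if x not in selected),
                   key=lambda x: f(selected | {x}) - f(selected))
        selected.add(best)
    return selected

# Toy coverage function: the value of a set of "sensors" is the size of
# the union of what they observe (coverage is submodular).
coverage = {1: {"a", "b", "c"}, 2: {"d", "e"}, 3: {"e"}}
f = lambda S: len(set().union(*(coverage[x] for x in S))) if S else 0
chosen = greedy_submodular(set(coverage), f, k=2)
# Greedy first picks sensor 1 (gain 3), then sensor 2 (gain 2).
```

Coverage-style objectives like this one are the textbook example; sensor selection, as in the entries above, is a natural instance.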
no code implementations • 30 Nov 2015 • Athirai A. Irissappane, Frans A. Oliehoek, Jie Zhang
In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors).
no code implementations • 29 Nov 2015 • Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.
no code implementations • 29 Nov 2015 • Philipp Robbel, Frans A. Oliehoek, Mykel J. Kochenderfer
We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP.
no code implementations • 18 Feb 2015 • Frans A. Oliehoek, Matthijs T. J. Spaan, Stefan Witwicki
Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents.
1 code implementation • 4 Apr 2014 • Christopher Amato, Frans A. Oliehoek
Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces.
no code implementations • 1 Aug 2011 • Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan
Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type.