no code implementations • ICML 2020 • Hengyuan Hu, Alexander Peysakhovich, Adam Lerer, Jakob Foerster
We consider the problem of zero-shot coordination - constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans).
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (RL)
no code implementations • 3 Nov 2023 • Hengyuan Hu, Suvir Mirchandani, Dorsa Sadigh
Despite the considerable potential of reinforcement learning (RL), robotic control tasks predominantly rely on imitation learning (IL) due to its better sample efficiency.
no code implementations • 14 Jun 2023 • Minae Kwon, Hengyuan Hu, Vivek Myers, Siddharth Karamcheti, Anca Dragan, Dorsa Sadigh
We additionally illustrate our approach with a robot on two carefully designed surfaces.
no code implementations • 25 Apr 2023 • Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown
Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
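The mirror descent update underlying such search algorithms can be sketched concretely. Below is a minimal, hypothetical illustration (not the paper's implementation): with the negative-entropy mirror map over the probability simplex, one mirror descent step reduces to the multiplicative-weights rule, scaling each action's probability by the exponentiated learning-rate-weighted value and renormalizing. The function name and toy values are illustrative assumptions.

```python
import numpy as np

def mirror_descent_step(policy, q_values, lr):
    """One mirror descent step on the probability simplex.

    With the negative-entropy mirror map, the update is the
    multiplicative-weights rule: multiply each action probability
    by exp(lr * Q) and renormalize.
    """
    logits = np.log(policy) + lr * q_values
    new_policy = np.exp(logits - logits.max())  # subtract max for numerical stability
    return new_policy / new_policy.sum()

# Toy example: a uniform 3-action policy shifts toward the highest-value action.
policy = np.ones(3) / 3
q = np.array([1.0, 0.0, -1.0])
for _ in range(10):
    policy = mirror_descent_step(policy, q, lr=0.5)
```

After repeated steps the policy concentrates on the best action while remaining a valid distribution, which is the basic mechanism the search procedures build on.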
no code implementations • 13 Apr 2023 • Hengyuan Hu, Dorsa Sadigh
One of the fundamental quests of AI is to produce agents that coordinate well with humans.
Tasks: Multi-agent Reinforcement Learning, reinforcement-learning, +1
no code implementations • 11 Oct 2022 • Hengyuan Hu, David J Wu, Adam Lerer, Jakob Foerster, Noam Brown
First, we show that our method outperforms experts when playing with a group of diverse human players in ad-hoc teams.
no code implementations • NeurIPS 2021 • Brandon Cui, Hengyuan Hu, Luis Pineda, Jakob N. Foerster
The standard problem setting in cooperative multi-agent learning is self-play (SP), where the goal is to train a team of agents that works well together.
no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster
Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.
no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown
We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.
1 code implementation • NeurIPS 2021 • Arnaud Fickinger, Hengyuan Hu, Brandon Amos, Stuart Russell, Noam Brown
Lookahead search has been a critical component of recent AI successes, such as in the games of chess, Go, and poker.
no code implementations • ICLR 2022 • Samuel Sokota, Hengyuan Hu, David J Wu, J Zico Kolter, Jakob Nicolaus Foerster, Noam Brown
Furthermore, because this specialization occurs after the action or policy has already been decided, BFT does not require the belief model to process it as input.
no code implementations • 16 Jun 2021 • Hengyuan Hu, Adam Lerer, Noam Brown, Jakob Foerster
Search is an important tool for computing effective policies in single- and multi-agent environments, and has been crucial for achieving superhuman performance in several benchmark fully and partially observable games.
5 code implementations • 6 Mar 2021 • Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster
Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents' actions; they thus fail when paired with humans or independently trained agents at test time.
no code implementations • NeurIPS 2020 • Jack Parker-Holder, Luke Metz, Cinjon Resnick, Hengyuan Hu, Adam Lerer, Alistair Letcher, Alex Peysakhovich, Aldo Pacchiano, Jakob Foerster
In the era of ever-decreasing loss functions, SGD and its various offspring have become the go-to optimization tools in machine learning and are a key component of the success of deep neural networks (DNNs).
2 code implementations • 6 Mar 2020 • Hengyuan Hu, Adam Lerer, Alex Peysakhovich, Jakob Foerster
We consider the problem of zero-shot coordination - constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans).
no code implementations • 27 Jan 2020 • Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov, Cheng-Ling Li, Hsin-I Lin, Yu-Jin Lin, Xavier Martinet, Vegard Mella, Jeremy Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud, Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko
Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games.
10 code implementations • 5 Dec 2019 • Adam Lerer, Hengyuan Hu, Jakob Foerster, Noam Brown
The first one, single-agent search, effectively converts the problem into a single agent setting by making all but one of the agents play according to the agreed-upon policy.
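The idea described above can be sketched as a one-ply Monte Carlo search. This is a hedged, simplified illustration rather than the paper's actual algorithm: all partners are fixed to the agreed-upon blueprint policy, and the searching agent estimates each candidate action's value by averaging rollout returns, then picks the best one. The function names and the toy rollout are assumptions for illustration.

```python
import random

def single_agent_search(state, actions, blueprint, rollout, num_rollouts=100):
    """Evaluate each candidate action by Monte Carlo rollouts in which
    every agent plays the fixed blueprint policy after the first move,
    and return the action with the highest estimated return."""
    best_action, best_value = None, float("-inf")
    for a in actions:
        value = sum(
            rollout(state, first_action=a, policy=blueprint)
            for _ in range(num_rollouts)
        ) / num_rollouts
        if value > best_value:
            best_action, best_value = a, value
    return best_action

# Toy check: a noisy rollout whose true value is highest for action 2.
def toy_rollout(state, first_action, policy):
    return first_action + random.gauss(0.0, 0.1)

random.seed(0)
best = single_agent_search(None, actions=[0, 1, 2], blueprint=None, rollout=toy_rollout)
```

Because only one agent deviates from the blueprint, the partially observable multi-agent problem collapses into a single-agent decision at each turn, which is what makes this form of search tractable.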
4 code implementations • ICLR 2020 • Hengyuan Hu, Jakob N. Foerster
Learning to be informative when observed by others is an interesting challenge for Reinforcement Learning (RL): Fundamentally, RL requires agents to explore in order to discover good policies.
1 code implementation • NeurIPS 2019 • Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis
We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.
no code implementations • ICLR 2018 • Hengyuan Hu, Ruslan Salakhutdinov
There have been numerous recent advances in learning deep generative models with latent variables, thanks to the reparameterization trick, which allows deep directed models to be trained effectively.
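The reparameterization trick mentioned above can be shown in a few lines. The sketch below (a minimal NumPy illustration, not the paper's code) rewrites a sample from N(mu, sigma^2) as a deterministic function of the distribution's parameters plus exogenous noise, so gradients can flow through mu and log_var in an autodiff framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) as mu + sigma * eps with eps ~ N(0, I).

    The randomness lives entirely in eps, so z is differentiable
    with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# A draw from a standard normal in 4 dimensions (mu = 0, sigma = 1).
mu = np.zeros(4)
log_var = np.zeros(4)
z = reparameterize(mu, log_var)
```

Without this rewrite, sampling would be a non-differentiable node in the computation graph and gradient-based training of the encoder parameters would not be possible.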
no code implementations • 15 Nov 2016 • Hengyuan Hu, Lisheng Gao, Quanbin Ma
The most famous among them are the deep belief network, which stacks multiple layer-wise pretrained RBMs to form a hybrid model, and the deep Boltzmann machine, which allows connections between hidden units to form a multi-layer structure.
7 code implementations • 12 Jul 2016 • Hengyuan Hu, Rui Peng, Yu-Wing Tai, Chi-Keung Tang
We alternate the pruning and retraining to further reduce zero activations in a network.
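The pruning criterion behind this procedure can be sketched briefly. The snippet below is a hypothetical illustration of an APoZ-style (average percentage of zeros) test: a unit whose activation is zero on most inputs is marked for removal. It shows only the selection step; the full procedure described above alternates this pruning with retraining to recover accuracy. The threshold value and toy data are assumptions.

```python
import numpy as np

def prune_by_apoz(activations, threshold=0.9):
    """Return a keep-mask over units: a unit is pruned when the fraction
    of zero activations across the dataset exceeds `threshold`."""
    apoz = (activations == 0).mean(axis=0)  # fraction of zeros per unit
    return apoz < threshold

# Toy data: 3 ReLU units over 1000 inputs; unit 1 is zero on >95% of them.
rng = np.random.default_rng(0)
acts = np.maximum(rng.normal(size=(1000, 3)), 0.0)
acts[:950, 1] = 0.0
keep = prune_by_apoz(acts)
```

In practice the kept units define a smaller network, which is retrained before the next pruning round, so accuracy is recovered while zero activations shrink further.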