1 code implementation • 8 May 2024 • Jinglin Xu, Yijie Guo, Yuxin Peng
We further extend FinePOSE to multi-human pose estimation.
no code implementations • 11 Apr 2024 • Tongzhou Mu, Yijie Guo, Jie Xu, Ankit Goyal, Hao Su, Dieter Fox, Animesh Garg
Encouraged by the remarkable achievements of language and vision foundation models, developing generalist robotic agents through imitation learning on large demonstration datasets has become a prominent area of interest in robot learning.
1 code implementation • 26 Jun 2023 • Ankit Goyal, Jie Xu, Yijie Guo, Valts Blukis, Yu-Wei Chao, Dieter Fox
In simulations, we find that a single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct).
Ranked #3 on Robot Manipulation on RLBench
no code implementations • 19 Jul 2022 • Yijie Guo, Qiucheng Wu, Honglak Lee
Meta reinforcement learning (meta-RL) aims to learn a policy that solves a set of training tasks simultaneously and quickly adapts to new tasks.
no code implementations • ICLR 2021 • Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed Chi, Honglak Lee, Minmin Chen
Many real-world applications of reinforcement learning (RL) require the agent to learn from a fixed set of trajectories, without collecting new interactions.
1 code implementation • NeurIPS 2020 • Kuang-Huei Lee, Ian Fischer, Anthony Liu, Yijie Guo, Honglak Lee, John Canny, Sergio Guadarrama
The Predictive Information is the mutual information between the past and the future, I(X_past; X_future).
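As a toy illustration of the quantity involved (not the paper's estimator, which works on high-dimensional trajectories), the mutual information I(X; Y) of a discrete joint distribution can be computed directly from its probability table:

```python
import numpy as np

def mutual_information(p_xy):
    """Mutual information I(X; Y) in nats from a discrete joint table p(x, y)."""
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y)
    mask = p_xy > 0                          # skip zero-probability cells (avoid log 0)
    return float((p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])).sum())

# Perfectly correlated "past" and "future": I(X; Y) = log 2 nats
p = np.array([[0.5, 0.0],
              [0.0, 0.5]])
print(mutual_information(p))  # → 0.6931... (= log 2)
```

Under independence the table factorizes, every log ratio is zero, and the mutual information vanishes, matching the intuition that an uninformative past predicts nothing about the future.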
no code implementations • 25 Sep 2019 • Yijie Guo, Jongwook Choi, Marcin Moczulski, Samy Bengio, Mohammad Norouzi, Honglak Lee
We propose a new method of learning a trajectory-conditioned policy to imitate diverse trajectories from the agent's own past experiences. We show that such self-imitation helps avoid myopic behavior and increases the chance of finding a globally optimal solution in hard-exploration tasks, especially when there are misleading rewards.
no code implementations • NeurIPS 2020 • Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee
Reinforcement learning with sparse rewards is challenging because an agent can rarely obtain non-zero rewards, and hence gradient-based optimization of parameterized policies can be incremental and slow.
no code implementations • ICLR 2019 • Yijie Guo, Junhyuk Oh, Satinder Singh, Honglak Lee
This paper explores a simple regularizer for reinforcement learning by proposing Generative Adversarial Self-Imitation Learning (GASIL), which encourages the agent to imitate its past good trajectories via a generative adversarial imitation learning framework.
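A toy sketch of the adversarial-imitation reward idea assumed here (hypothetical simplified form, not the paper's implementation): a discriminator is trained to output high logits on state-action pairs from good past trajectories, and the agent receives a reward for pairs the discriminator mistakes for good-trajectory data:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adversarial_imitation_reward(disc_logit):
    """GAIL-style surrogate reward from a discriminator logit.

    The discriminator D is assumed trained so that D ~ 1 on good past
    trajectories; the agent's reward grows as D is fooled into scoring
    the agent's own state-action pair as 'good-trajectory-like'.
    """
    d = sigmoid(disc_logit)
    return -np.log(1.0 - d + 1e-8)  # higher when D thinks the pair looks 'good'
```

The `1e-8` term is a common numerical-stability guard so the log never sees exactly zero; the reward is monotonically increasing in the discriminator's score.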
no code implementations • ICLR 2019 • Jongwook Choi, Yijie Guo, Marcin Moczulski, Junhyuk Oh, Neal Wu, Mohammad Norouzi, Honglak Lee
This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning.
Ranked #8 on Atari Games on Atari 2600 Montezuma's Revenge
4 code implementations • ICML 2018 • Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee
This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions.
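The core SIL idea can be sketched in a few lines of numpy (a simplified form for illustration; the full algorithm also uses a prioritized replay buffer): the advantage is clipped at zero, so only transitions whose observed past return exceeds the current value estimate contribute to the imitation loss.

```python
import numpy as np

def sil_losses(log_probs, returns, values):
    """Self-imitation losses over a batch of stored transitions.

    log_probs: log pi(a|s) under the current policy
    returns:   discounted returns observed in past episodes
    values:    current value estimates V(s)
    """
    adv = np.maximum(returns - values, 0.0)   # (R - V)+  clipped advantage
    policy_loss = -(log_probs * adv).mean()   # imitate only better-than-expected actions
    value_loss = 0.5 * (adv ** 2).mean()      # pull V(s) up toward good past returns
    return policy_loss, value_loss
```

A transition whose return falls below the value estimate has zero clipped advantage and contributes nothing, which is what makes the update an imitation of "past good decisions" rather than of all past behavior.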
Ranked #3 on Atari Games on Atari 2600 Atlantis
1 code implementation • CVPR 2018 • Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee
Deep neural networks can model images with rich latent representations, but they cannot naturally conceptualize structures of object categories in a human-perceptible way.
no code implementations • CVPR 2017 • Yuting Zhang, Luyao Yuan, Yijie Guo, Zhiyuan He, I-An Huang, Honglak Lee
Our training objective encourages better localization on single images, incorporates text phrases in a broad range, and properly pairs image regions with text phrases into positive and negative examples.
2 code implementations • NeurIPS 2016 • Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
We demonstrate the ability of the model to generate a 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects; and (3) testing on novel object classes.