no code implementations • 16 Apr 2024 • Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
However, in academic benchmarks, state-of-the-art results are often achieved via reward-free methods, such as Direct Preference Optimization (DPO).
1 code implementation • 23 Dec 2023 • Jijia Liu, Chao Yu, Jiaxuan Gao, Yuqing Xie, Qingmin Liao, Yi Wu, Yu Wang
AI agents powered by Large Language Models (LLMs) have made significant advances, enabling them to assist humans in diverse complex tasks and leading to a revolution in human-AI coordination.
1 code implementation • 3 Feb 2023 • Chao Yu, Jiaxuan Gao, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu
A crucial limitation of this framework is that every policy in the pool is optimized w. r. t.
2 code implementations • 9 Jan 2023 • Chao Yu, Xinyi Yang, Jiaxuan Gao, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu, Yu Wang
Simply waiting for every robot being ready for the next action can be particularly time-inefficient.
Multi-agent Reinforcement Learning reinforcement-learning +1
no code implementations • 12 Oct 2021 • Chao Yu, Xinyi Yang, Jiaxuan Gao, Huazhong Yang, Yu Wang, Yi Wu
In this paper, we extend the state-of-the-art single-agent visual navigation method, Active Neural SLAM (ANS), to the multi-agent setting by introducing a novel RL-based planning module, Multi-agent Spatial Planner (MSP). MSP leverages a transformer-based architecture, Spatial-TeamFormer, which effectively captures spatial relations and intra-agent interactions via hierarchical spatial self-attentions.
16 code implementations • 2 Mar 2021 • Chao Yu, Akash Velu, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, Yi Wu
This is often due to the belief that PPO is significantly less sample efficient than off-policy methods in multi-agent systems.
Multi-agent Reinforcement Learning reinforcement-learning +3