1 code implementation • 4 Feb 2024 • Yifu Yuan, Jianye Hao, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng
It is crucial to consider diverse human feedback types and various learning methods in different environments.
no code implementations • 27 Jan 2024 • Zibin Dong, Jianye Hao, Yifu Yuan, Fei Ni, Yitian Wang, Pengyi Li, Yan Zheng
Diffusion planning has been recognized as an effective decision-making paradigm in various domains.
no code implementations • 3 Oct 2023 • Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Tangjie Lv, Changjie Fan, Zhipeng Hu
Aligning agent behaviors with diverse human preferences remains a challenging problem in reinforcement learning (RL), owing to the inherent abstractness and mutability of human preferences.