1 code implementation • 5 Sep 2023 • Junming Yang, Xingguo Chen, Shengyuan Wang, Bolei Zhang
Model-based offline reinforcement learning (RL), which builds a supervised transition model from a logged dataset to avoid costly interactions with the online environment, has been a promising approach for offline policy optimization.