no code implementations • 1 Jul 2022 • Jingda Wu, Wenhui Huang, Niels de Boer, Yanghui Mo, Xiangkun He, Chen Lv
Decisions made by human subjects in a driving simulator are treated as safe demonstrations, which are stored into the replay buffer and then utilized to enhance the training process of RL.