no code implementations • 31 Jan 2024 • Tim Tse, Isaac Chan, Zhitang Chen
In this work, we propose a novel algorithmic framework for data sharing and coordinated exploration, with the goal of learning more data-efficient and better-performing policies in a concurrent reinforcement learning (CRL) setting.
no code implementations • 4 Jun 2022 • Brett Daley, Isaac Chan
Q($\sigma$) is a recently proposed temporal-difference learning method that interpolates between learning from expected backups and sampled backups.
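As a rough illustration of that interpolation, the sketch below computes a one-step Q($\sigma$) backup target, where $\sigma = 1$ gives a fully sampled (Sarsa-style) backup and $\sigma = 0$ gives a fully expected (Expected Sarsa-style) backup. The one-step form, the function name `q_sigma_target`, and all parameter names are assumptions made for illustration; they are not taken from the paper listed above.

```python
def q_sigma_target(reward, gamma, sigma, q_next, pi_next, next_action):
    """One-step Q(sigma) backup target (illustrative sketch).

    reward      -- immediate reward observed after the transition
    gamma       -- discount factor in [0, 1]
    sigma       -- degree of sampling in [0, 1]; 1 = sampled, 0 = expected
    q_next      -- list of action values Q(s', a) for the next state
    pi_next     -- list of target-policy probabilities pi(a | s')
    next_action -- index of the action actually sampled in s'
    """
    # Sampled backup: value of the action that was actually taken next.
    sampled = q_next[next_action]
    # Expected backup: policy-weighted average over all next actions.
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    # Linear interpolation between the two backups, controlled by sigma.
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


if __name__ == "__main__":
    q_next = [1.0, 3.0]
    pi_next = [0.5, 0.5]
    for sigma in (0.0, 0.5, 1.0):
        target = q_sigma_target(1.0, 0.5, sigma, q_next, pi_next, next_action=1)
        print(sigma, target)
```

Intermediate values of `sigma` blend the two backups, trading the lower variance of the expectation against the sampled return.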