2 May 2024 • Harshit Dhankar, Kshitij Mishra, Tejas Bodas
Compared with existing RL algorithms that learn the Gittins index, our algorithms have a lower run time, require less storage space (a smaller Q-table in QGI and a smaller replay buffer in DGN), and exhibit better empirical convergence to the Gittins index.