no code implementations • 10 Oct 2023 • Jacob Chmura, Hasham Burhani, Xiao Qi Shi
We expand on this topic and propose a new intrinsic reward that systematically quantifies exploratory behavior and promotes state coverage by maximizing the information content of a trajectory taken by an agent.
no code implementations • 8 Aug 2023 • Hasham Burhani, Xiao Qi Shi, Jonathan Jaegerman, Daniel Balicki
From our analysis of the aforementioned problems, we derive a novel loss function for reinforcement learning and supervised classification.