no code implementations • 10 Oct 2023 • Jacob Chmura, Hasham Burhani, Xiao Qi Shi
We expand on this topic and propose a new intrinsic reward that systematically quantifies exploratory behavior and promotes state coverage by maximizing the information content of the trajectory an agent takes.