no code implementations • 28 Nov 2023 • Brett Barkley, Amy Zhang, David Fridovich-Keil
We observe that exploiting the structure of time reversal in an MDP allows every environment transition experienced by an agent to be transformed into a feasible reverse-time transition, effectively doubling the number of experiences collected from the environment.
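To make the idea concrete, here is a minimal sketch of reverse-time augmentation on a toy system. It is not the authors' implementation. Everything in it is an assumption chosen for illustration: a frictionless unit point mass with state `(position, velocity)`, a force action held constant over a step of hypothetical length `DT`, and the helper names `step`, `reverse_transition`, `augment`, and `is_feasible`. For these particular dynamics, flipping the sign of the velocity and swapping the roles of state and next state yields a transition the same dynamics can actually produce. Rewards are left out, since a reversed transition would generally need its reward recomputed.

```python
import math

DT = 0.1  # hypothetical step size, chosen only for this sketch


def step(state, action, dt=DT):
    """Toy dynamics: unit point mass, constant force `action` over one step."""
    q, v = state
    q_next = q + v * dt + 0.5 * action * dt * dt
    v_next = v + action * dt
    return (q_next, v_next)


def reverse_transition(state, action, next_state):
    """Map (s, a, s') to its time-reversed counterpart.

    Assumes time-reversal symmetry of the toy dynamics above: start from
    the next state with velocity negated, apply the same action, and end
    at the original state with velocity negated.
    """
    q, v = state
    q_next, v_next = next_state
    return (q_next, -v_next), action, (q, -v)


def augment(transitions):
    """Return each transition followed by its reverse-time version."""
    augmented = []
    for transition in transitions:
        augmented.append(transition)
        augmented.append(reverse_transition(*transition))
    return augmented


def is_feasible(transition, tol=1e-9):
    """Check that stepping from s under a reproduces s' (up to `tol`)."""
    state, action, next_state = transition
    predicted = step(state, action)
    return all(
        math.isclose(p, x, abs_tol=tol) for p, x in zip(predicted, next_state)
    )


if __name__ == "__main__":
    s0 = (0.0, 1.0)
    a0 = 0.5
    data = [(s0, a0, step(s0, a0))]
    doubled = augment(data)
    print(len(data), "->", len(doubled), "transitions")
    print("all feasible:", all(is_feasible(t) for t in doubled))
```

The feasibility check matters: the reversal map above is only valid because the toy dynamics are time-reversible. With friction, contacts, or other irreversible effects, the same map could produce transitions the environment cannot generate.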