Assessing Evolutionary Terrain Generation Methods for Curriculum Reinforcement Learning

29 Mar 2022 · David Howard, Josh Kannemeyer, Davide Dolcetti, Humphrey Munn, Nicole Robinson ·

Curriculum learning allows complex tasks to be mastered via incremental progression over `stepping stone' goals towards a final desired behaviour. Typical implementations learn locomotion policies for challenging environments through gradual complexification of a terrain mesh generated through a parameterised noise function. To date, researchers have predominantly generated terrains from a limited range of noise functions, and the effect of the generator on the learning process is underrepresented in the literature. We compare popular noise-based terrain generators to two indirect encodings, CPPN and GAN. To allow direct comparison between both direct and indirect representations, we assess the impact of a range of representation-agnostic MAP-Elites feature descriptors that compute metrics directly from the generated terrain meshes. Next, performance and coverage are assessed when training a humanoid robot in a physics simulator using the PPO algorithm. Results describe key differences between the generators that inform their use in curriculum learning, and present a range of useful feature descriptors for uptake by the community.

PDF Abstract