no code implementations • 30 Apr 2024 • Kristina Levina, Nikolaos Pappas, Athanasios Karapantelakis, Aneta Vulgarakis Feljan, Jendrik Seipp
We compare our new approaches to a baseline reward machine in the Craft domain, where the numeric feature is the agent-to-target distance.