no code implementations • 11 Feb 2024 • Shayan Meshkat Alsadat, Jean-Raphael Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu
Our method uses Large Language Models (LLM) to obtain high-level domain-specific knowledge using prompt engineering instead of providing the reinforcement learning algorithm directly with the high-level knowledge which requires an expert to encode the automaton.
no code implementations • 27 May 2023 • Jueming Hu, Jean-Raphael Gaglione, Yanze Wang, Zhe Xu, Ufuk Topcu, Yongming Liu
We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent.