9 Jan 2023 • Fengyin Li, Yuqiang Li, Xianyi Wu
Policy evaluation problems in reinforcement learning are often modeled as finite-horizon MDPs or as infinite-horizon MDPs under a discounted or average-reward criterion.
Reinforcement Learning (RL)
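
To make the modeling choices named in the abstract concrete, here is a minimal sketch of policy evaluation under two of them: the finite-horizon criterion and the discounted infinite-horizon criterion. It is not the authors' method. It assumes a fixed policy has already been folded into a Markov reward process, so `P` is the state-to-state transition matrix and `r` the expected one-step reward under that policy. The function names, the two-state example, and the tolerance are all hypothetical choices for illustration. The average-reward criterion is not covered.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def evaluate_finite_horizon(P: Matrix, r: Vector, horizon: int) -> Vector:
    """Backward induction: value of each state with `horizon` steps to go.

    Starts from a zero terminal value and applies v <- r + P v once per step.
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(horizon):
        v = [r[s] + sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v


def evaluate_discounted(
    P: Matrix, r: Vector, gamma: float, tol: float = 1e-10, max_iter: int = 100_000
) -> Vector:
    """Fixed-point iteration for v = r + gamma * P v, with 0 <= gamma < 1.

    Stops when the largest per-state change falls below `tol`.
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [
            r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical two-state chain: state 0 pays reward 1 and moves to the
# absorbing, zero-reward state 1 with probability 0.5.
P = [[0.5, 0.5], [0.0, 1.0]]
r = [1.0, 0.0]

v_finite = evaluate_finite_horizon(P, r, horizon=2)
v_disc = evaluate_discounted(P, r, gamma=0.9)
```

In this example the discounted value of state 0 satisfies v0 = 1 + 0.9 · 0.5 · v0, so v0 = 1 / 0.55, which the iteration approximates; the two-step finite-horizon value of state 0 is 1 + 0.5 · 1 = 1.5.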