no code implementations • 6 May 2020 • Andrea Franceschetti, Elisa Tosello, Nicola Castaman, Stefano Ghidoni
This paper presents a detailed and extensive comparison of Trust Region Policy Optimization and Deep Q-Network with Normalized Advantage Functions against other state-of-the-art algorithms, namely Deep Deterministic Policy Gradient and Vanilla Policy Gradient.