no code implementations • ICLR 2022 • Zhikang T. Wang, Masahito Ueda
Despite the empirical success of the deep Q network (DQN) reinforcement learning algorithm and its variants, DQN is still not well understood and is not guaranteed to converge.
1 code implementation • 29 Jun 2021 • Zhikang T. Wang, Masahito Ueda
Despite the empirical success of the deep Q network (DQN) reinforcement learning algorithm and its variants, DQN is still not well understood and is not guaranteed to converge.
1 code implementation • 12 Feb 2020 • Liu Ziyin, Zhikang T. Wang, Masahito Ueda
We also bound the regret of LaProp on a convex problem and show that our bound differs from that of Adam by a key factor, which demonstrates LaProp's advantage.
1 code implementation • 21 Oct 2019 • Zhikang T. Wang, Yuto Ashida, Masahito Ueda
We generalize a standard benchmark of reinforcement learning, the classical cartpole balancing problem, to the quantum regime by stabilizing a particle in an unstable potential through measurement and feedback.