16 Apr 2024 • Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard
We provide the first known algorithm that provably achieves $\varepsilon$-optimality within $\widetilde{\mathcal{O}}(1/\varepsilon)$ function evaluations for the discounted discrete-time LQR problem with unknown parameters, without relying on two-point gradient estimates.
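To make the contrast with two-point estimates concrete, here is a minimal sketch of a generic one-point zeroth-order gradient estimate for a discounted LQR cost, where each estimate uses a single cost evaluation. This is the standard sphere-smoothing estimator, not the authors' algorithm. The function names (`discounted_cost`, `one_point_gradient`), the noise-free truncated rollout, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def discounted_cost(A, B, Q, R, K, x0, gamma, horizon=200):
    """Truncated discounted LQR cost of the linear policy u_t = -K x_t.

    Assumption: noise-free rollout, truncated after `horizon` steps.
    """
    x = np.asarray(x0, dtype=float)
    cost, discount = 0.0, 1.0
    for _ in range(horizon):
        u = -K @ x
        cost += discount * (x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u
        discount *= gamma
    return cost

def one_point_gradient(cost_fn, K, radius, rng):
    """Gradient estimate of cost_fn at K from ONE function evaluation."""
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)  # random direction on the unit Frobenius sphere
    d = K.size              # dimension of the policy parameter
    return (d / radius) * cost_fn(K + radius * U) * U

# Hypothetical scalar system, chosen only for illustration.
A = np.array([[0.9]])
B = np.array([[0.5]])
Q = np.eye(1)
R = np.eye(1)
K = np.array([[0.2]])
x0 = np.array([1.0])
gamma = 0.9

rng = np.random.default_rng(0)
cost_fn = lambda K_: discounted_cost(A, B, Q, R, K_, x0, gamma)
g = one_point_gradient(cost_fn, K, radius=0.1, rng=rng)
```

A two-point estimate would instead evaluate the cost at both `K + radius * U` and `K - radius * U` and use their difference, which typically lowers variance but requires two evaluations under the same randomness.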