no code implementations • 2 Jun 2023 • Roi Peleg, Teddy Lazebnik, Assaf Hoogi
Existing optimizers predominantly adopt techniques based on the first-order exponential moving average (EMA), which results in noticeable delays that impede the real-time tracking of gradients trend and consequently yield sub-optimal performance.