no code implementations • 19 Nov 2023 • Ekaterina Lobacheva, Eduard Pockonechnyy, Maxim Kodryan, Dmitry Vetrov
Inspired by recent research that recommends starting neural networks training with large learning rates (LRs) to achieve the best generalization, we explore this hypothesis in detail.