no code implementations • 13 Apr 2023 • Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis
The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data.
1 code implementation • 11 Apr 2023 • Jeremy Bernstein, Chris Mingard, Kevin Huang, Navid Azizan, Yisong Yue
Automatic gradient descent trains both fully-connected and convolutional networks out-of-the-box and at ImageNet scale.
no code implementations • 22 Oct 2021 • Yizhang Lou, Chris Mingard, Yoonsoo Nam, Soufiane Hayou
Recent work by Baratin et al. (2021) sheds light on an intriguing pattern that occurs during the training of deep neural networks: some layers align much more with the data than others (where alignment is defined as the Euclidean product of the tangent features matrix and the data labels matrix).
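The alignment notion described above can be illustrated with a toy computation. The sketch below is an assumption-laden simplification, not the paper's exact estimator: it takes a layer's tangent feature matrix as given (here, synthetic random matrices) and computes a normalized Frobenius inner product between the layer's tangent kernel and the label Gram matrix.

```python
import numpy as np

def alignment(phi, y):
    """Illustrative alignment between a layer's tangent feature matrix
    phi (n_samples x n_params) and labels y (n_samples,): the normalized
    Frobenius inner product <phi phi^T, y y^T> / (||phi phi^T|| ||y y^T||)."""
    K = phi @ phi.T          # tangent kernel contribution of this layer
    Y = np.outer(y, y)       # label Gram matrix
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=32)

# A layer whose tangent features point along the label direction aligns
# strongly; a layer with random tangent features does not.
phi_aligned = np.outer(y, rng.standard_normal(100))
phi_random = rng.standard_normal((32, 100))

print(alignment(phi_aligned, y))   # exactly 1 by construction
print(alignment(phi_random, y))    # much smaller
```

The contrast between the two synthetic layers mirrors the qualitative pattern the paper studies: during training, some layers' tangent kernels become far more label-aligned than others'.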
no code implementations • 26 Jun 2020 • Chris Mingard, Guillermo Valle-Pérez, Joar Skalse, Ard A. Louis
Our main findings are that $P_{SGD}(f\mid S)$ correlates remarkably well with $P_B(f\mid S)$, and that $P_B(f\mid S)$ is strongly biased towards low-error and low-complexity functions.
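One way to make $P_{SGD}(f\mid S)$ concrete is to estimate it empirically: train the same model from many random initializations and count how often each function appears, identifying a "function" with its pattern of predictions on held-out inputs. The sketch below is a hypothetical toy version of this idea using a simple perceptron, not the authors' experimental setup.

```python
import numpy as np
from collections import Counter

def train_perceptron(X, y, lr=0.1, steps=200, rng=None):
    """Train a linear classifier from a random init with the perceptron
    update rule (a stand-in for SGD on a toy problem)."""
    w = rng.standard_normal(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(X))
        if np.sign(X[i] @ w) != y[i]:
            w += lr * y[i] * X[i]
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
y = np.sign(X @ np.array([1.0, -1.0, 0.5]))   # linearly separable labels
X_test = rng.standard_normal((5, 3))           # held-out inputs

# Empirical P_SGD(f|S): frequency of each function over repeated training runs.
counts = Counter()
n_runs = 500
for _ in range(n_runs):
    w = train_perceptron(X, y, rng=rng)
    f = tuple(np.sign(X_test @ w))   # the learned function, restricted to X_test
    counts[f] += 1

for f, c in counts.most_common(3):
    print(f, c / n_runs)
```

A Bayesian analogue $P_B(f\mid S)$ would instead weight each function by how often random parameter draws fit the training set $S$; the paper's finding is that these two distributions correlate remarkably well.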
no code implementations • 25 Sep 2019 • Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, Vladimir Mikulik, Ard A. Louis
Understanding the inductive bias of neural networks is critical to explaining their ability to generalise.