no code implementations • 12 Mar 2024 • Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf
We propose a fresh take on understanding the mechanisms of neural networks by analyzing the rich structure of parameters contained within their optimization trajectories.
no code implementations • 3 Dec 2023 • Yuhui Ding, Antonio Orvieto, Bobby He, Thomas Hofmann
Graph neural networks based on iterative one-hop message passing have been shown to struggle to harness information from distant nodes effectively.
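The limitation can be sketched in a few lines (illustrative, not the paper's code): with one-hop message passing, after k rounds a node's representation can only depend on nodes within k hops, so information from distant nodes needs many layers to arrive.

```python
import numpy as np

# Path graph 0-1-2-3, adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.eye(4)  # one-hot node features

def one_hop_step(A, H):
    # One round of message passing: mean over self + neighbours.
    A_hat = A + np.eye(len(A))
    deg = A_hat.sum(axis=1, keepdims=True)
    return A_hat @ H / deg

H1 = one_hop_step(A, H)
H2 = one_hop_step(A, H1)
H3 = one_hop_step(A, H2)
# Node 0 receives no signal from node 3 (three hops away) until round 3:
print(H1[0, 3], H2[0, 3], H3[0, 3] > 0)
```

After two rounds the entry `H2[0, 3]` is still exactly zero; only the third round lets node 3's feature reach node 0.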
1 code implementation • 3 Nov 2023 • Bobby He, Thomas Hofmann
A simple design recipe for deep Transformers is to compose identical building blocks.
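The recipe can be sketched as follows (a minimal illustration using PyTorch, with hypothetical names — not the paper's architecture or code): a deep Transformer as a stack of structurally identical pre-norm blocks, each an attention sub-layer and an MLP sub-layer wrapped in skip connections.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One standard pre-norm Transformer building block."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model),
                                 nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        h = self.norm1(x)
        a, _ = self.attn(h, h, h)
        x = x + a                          # skip connection around attention
        x = x + self.mlp(self.norm2(x))    # skip connection around MLP
        return x

def make_transformer(n_layers, d_model=64, n_heads=4):
    # Identical building blocks, each an independently initialised copy.
    return nn.Sequential(*[Block(d_model, n_heads) for _ in range(n_layers)])
```

Because every block has the same shape-preserving signature, depth becomes a single hyperparameter: `make_transformer(12)` and `make_transformer(96)` differ only in how many copies are composed.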
no code implementations • NeurIPS 2023 • Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy
Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.
no code implementations • 20 Feb 2023 • Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh
Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood.
1 code implementation • 22 Feb 2022 • Francisca Vasconcelos, Bobby He, Nalini Singh, Yee Whye Teh
We study UncertaINR, a Bayesian reformulation of implicit neural representations (INRs), in the context of computed tomography, and evaluate several Bayesian deep learning implementations in terms of accuracy and calibration.
no code implementations • 22 Oct 2021 • Soufiane Hayou, Bobby He, Gintare Karolina Dziugaite
In the linear model, we show that a PAC-Bayes generalization error bound is controlled by the magnitude of the change in feature alignment between the 'prior' and 'posterior' data.
no code implementations • ICLR 2022 • Bobby He, Mete Ozay
Trained Neural Networks (NNs) can be viewed as data-dependent kernel machines, with predictions determined by the inner product of last-layer representations across inputs, referred to as the feature kernel.
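The feature kernel in this view is simply the Gram matrix of last-layer representations. A minimal sketch (illustrative, not the paper's code):

```python
import numpy as np

def feature_kernel(phi):
    # phi: (n_inputs, d) array of last-layer representations.
    # Entry (i, j) is the inner product of representations of inputs i and j.
    return phi @ phi.T

rng = np.random.default_rng(0)
phi = rng.normal(size=(5, 16))  # last-layer features of 5 inputs
K = feature_kernel(phi)
# K is a symmetric positive semi-definite 5x5 kernel matrix; the final
# linear readout's predictions depend on the inputs only through phi,
# hence (together with the readout weights) through K.
print(K.shape)  # (5, 5)
```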
no code implementations • 24 Oct 2020 • Soufiane Hayou, Eugenio Clerico, Bobby He, George Deligiannidis, Arnaud Doucet, Judith Rousseau
Deep ResNet architectures have achieved state-of-the-art performance on many tasks.
3 code implementations • NeurIPS 2020 • Bobby He, Balaji Lakshminarayanan, Yee Whye Teh
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK): a recent development in understanding the training dynamics of wide neural networks (NNs).
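The empirical NTK underlying this line of work is the inner product of parameter gradients, K(x, x') = ⟨∇θ f(x), ∇θ f(x')⟩. A minimal sketch for a two-layer ReLU network, with the gradients written out by hand (illustrative, not the paper's code):

```python
import numpy as np

def ntk(x1, x2, W, v):
    # f(x) = v . relu(W x); empirical NTK = <grad_theta f(x1), grad_theta f(x2)>.
    def grads(x):
        h = W @ x
        mask = (h > 0).astype(float)
        dv = np.maximum(h, 0.0)        # d f / d v
        dW = np.outer(v * mask, x)     # d f / d W  (ReLU gate on each row)
        return np.concatenate([dW.ravel(), dv])
    return grads(x1) @ grads(x2)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4)) / 2.0
v = rng.normal(size=8) / np.sqrt(8)
x1, x2 = rng.normal(size=4), rng.normal(size=4)
# The kernel is symmetric, and an input's kernel with itself is a
# squared gradient norm, hence non-negative.
print(ntk(x1, x1, W, v) >= 0)
```

At infinite width this kernel concentrates and stays fixed during training, which is what licenses the GP-style analysis of wide-network training dynamics.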