no code implementations • 5 Feb 2024 • Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani
Adaptive gradient optimizers like Adam(W) are the default training algorithms for many deep learning architectures, such as transformers.
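For orientation, a minimal PyTorch sketch of this default setup; the toy transformer and hyperparameters below are illustrative assumptions, not the paper's experimental configuration.

```python
# Minimal sketch: AdamW as the default optimizer for a transformer-style model.
# Model size, learning rate, and weight decay are illustrative, not from the paper.
import torch
from torch import nn

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

x = torch.randn(8, 16, 64)        # (batch, sequence, features)
loss = model(x).pow(2).mean()     # dummy objective
loss.backward()
optimizer.step()
optimizer.zero_grad()
```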
1 code implementation • 9 Dec 2023 • Wu Lin, Felix Dangel, Runa Eschenhagen, Kirill Neklyudov, Agustinus Kristiadi, Richard E. Turner, Alireza Makhzani
Second-order methods, such as KFAC, can be useful for neural network training.
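As a rough sketch (not the paper's algorithm), KFAC approximates the curvature block of one linear layer by a Kronecker product of two small matrices built from the layer's inputs and output gradients; the shapes, damping, and pseudo-data below are illustrative assumptions.

```python
# Minimal sketch of the Kronecker-factored (KFAC) curvature approximation for a
# single linear layer. All quantities below are toy placeholders.
import torch

N, d_in, d_out = 128, 20, 10
a = torch.randn(N, d_in)             # layer inputs
g = torch.randn(N, d_out)            # gradients w.r.t. layer outputs (pseudo-data)
grad_W = g.T @ a / N                 # mini-batch gradient of the weight matrix

A = a.T @ a / N                      # Kronecker factor from the inputs
G = g.T @ g / N                      # Kronecker factor from the output gradients
damping = 1e-3

# KFAC preconditioning: (G + lambda*I)^{-1} grad_W (A + lambda*I)^{-1},
# avoiding the (d_in*d_out) x (d_in*d_out) full curvature matrix.
precond = torch.linalg.solve(G + damping * torch.eye(d_out), grad_W)
precond = torch.linalg.solve(A + damping * torch.eye(d_in), precond.T).T
```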
no code implementations • 29 Sep 2023 • Jonathan Wenger, Felix Dangel, Agustinus Kristiadi
Our empirical results demonstrate that this is not the case in optimization, uncertainty quantification or continual learning.
no code implementations • 5 Jul 2023 • Felix Dangel
Despite their intuitive simplicity, convolutions are more tedious to analyze than dense layers, which complicates the generalization of theoretical and algorithmic ideas.
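One way to see the difficulty, as a minimal sketch with illustrative shapes: via im2col (torch.nn.functional.unfold), a convolution can be rewritten as a dense matrix multiply, but only after reshaping and duplicating the input, which is what makes the analysis more tedious than for a plain dense layer.

```python
# Sketch: a 2d convolution expressed as a matrix multiply via im2col/unfold.
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)           # (batch, channels, height, width)
weight = torch.randn(5, 3, 3, 3)      # (out_channels, in_channels, kH, kW)

out_conv = F.conv2d(x, weight)        # direct convolution, shape (1, 5, 6, 6)

cols = F.unfold(x, kernel_size=3)     # (1, 3*3*3, 36) unfolded input patches
out_mat = weight.view(5, -1) @ cols   # ordinary dense matrix multiply
out_mat = out_mat.view(1, 5, 6, 6)

print(torch.allclose(out_conv, out_mat, atol=1e-5))  # True: same operation
```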
3 code implementations • 4 Jun 2021 • Felix Dangel, Lukas Tatzel, Philipp Hennig
Curvature in the form of the Hessian or its generalized Gauss-Newton (GGN) approximation is valuable for algorithms that rely on a local model for the loss to train, compress, or explain deep networks.
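Such curvature is typically accessed matrix-free. Below is a minimal sketch of a Hessian-vector product via double backpropagation in PyTorch; the model, data, and loss are illustrative assumptions, not the paper's setup.

```python
# Sketch: matrix-free curvature access through Hessian-vector products.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
params = [p for p in model.parameters() if p.requires_grad]
x, y = torch.randn(16, 10), torch.randn(16, 1)

loss = nn.functional.mse_loss(model(x), y)
grads = torch.autograd.grad(loss, params, create_graph=True)

v = [torch.randn_like(p) for p in params]         # direction vector
hvp = torch.autograd.grad(                         # Hessian-vector product
    sum((g * vi).sum() for g, vi in zip(grads, v)), params
)
```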
2 code implementations • NeurIPS 2021 • Frank Schneider, Felix Dangel, Philipp Hennig
When engineers train deep learning models, they are very much 'flying blind'.
1 code implementation • ICLR 2020 • Felix Dangel, Frederik Kunstner, Philipp Hennig
Automatic differentiation frameworks are optimized for exactly one thing: computing the average mini-batch gradient.
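A minimal sketch of that limitation, with an illustrative model and data: one backward pass returns only the averaged gradient, while recovering per-sample gradients naively takes one backward pass per example.

```python
# Sketch: averaged mini-batch gradient vs. naive per-sample gradients in PyTorch.
import torch
from torch import nn

model = nn.Linear(5, 1)
x, y = torch.randn(8, 5), torch.randn(8, 1)
loss_fn = nn.MSELoss()

# What autodiff frameworks are optimized for: one backward pass, averaged gradient.
loss_fn(model(x), y).backward()
avg_grad = model.weight.grad.clone()

# Per-sample gradients the naive way: one backward pass per example.
per_sample = []
for xi, yi in zip(x, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    per_sample.append(model.weight.grad.clone())

# The averaged gradient equals the mean of the individual gradients.
print(torch.allclose(avg_grad, torch.stack(per_sample).mean(0), atol=1e-6))
```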
1 code implementation • 5 Feb 2019 • Felix Dangel, Stefan Harmeling, Philipp Hennig
We propose a modular extension of backpropagation for the computation of block-diagonal approximations to various curvature matrices of the training objective (in particular, the Hessian, generalized Gauss-Newton, and positive-curvature Hessian).
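As a rough illustration of the block-diagonal view (not the paper's modular backpropagation scheme), the sketch below computes one exact Hessian block per parameter tensor with double backprop and ignores cross-layer blocks; the model and data are illustrative assumptions.

```python
# Sketch: block-diagonal curvature as one Hessian block per parameter tensor.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 3), nn.Sigmoid(), nn.Linear(3, 1))
x, y = torch.randn(10, 4), torch.randn(10, 1)
loss = nn.functional.mse_loss(model(x), y)

blocks = {}
for name, p in model.named_parameters():
    (g,) = torch.autograd.grad(loss, p, create_graph=True)
    g = g.reshape(-1)
    # Differentiating each gradient entry again gives this parameter's diagonal block.
    rows = [torch.autograd.grad(g[i], p, retain_graph=True)[0].reshape(-1)
            for i in range(g.numel())]
    blocks[name] = torch.stack(rows)   # (numel, numel) Hessian block
```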