Search Results for author: Marius Hobbhahn

Found 9 papers, 5 papers with code

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

1 code implementation • 17 May 2024 • Lucius Bushnaq, Stefan Heimersheim, Nicholas Goldowsky-Dill, Dan Braun, Jake Mendel, Kaarel Hänni, Avery Griffin, Jörn Stöhler, Magdalena Wache, Marius Hobbhahn

We present a novel interpretability method that aims to overcome this limitation by transforming the activations of the network into a new basis: the Local Interaction Basis (LIB).
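
The LIB itself is derived from the network's interaction structure, but the basic move, rewriting activations in a different linear basis without losing information, can be sketched generically. Below is a minimal NumPy illustration that uses a PCA-style eigenbasis as a stand-in; it is not the paper's LIB construction.

```python
import numpy as np

# Generic change-of-basis sketch for layer activations. The actual LIB uses
# the network's interaction structure (gradients between layers), not the
# plain activation covariance used here as a stand-in.

rng = np.random.default_rng(0)
acts = rng.normal(size=(1024, 64))      # (n_samples, d_hidden) activations

cov = np.cov(acts, rowvar=False)        # activation covariance
eigvals, eigvecs = np.linalg.eigh(cov)  # orthonormal eigenbasis

acts_new = acts @ eigvecs               # activations in the new basis

# The transform is orthogonal, hence invertible: no information is lost.
assert np.allclose(acts, acts_new @ eigvecs.T)
```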

Using Degeneracy in the Loss Landscape for Mechanistic Interpretability

no code implementations • 17 May 2024 • Lucius Bushnaq, Jake Mendel, Stefan Heimersheim, Dan Braun, Nicholas Goldowsky-Dill, Kaarel Hänni, Cindy Wu, Marius Hobbhahn

We propose that a representation of a neural network that is invariant to reparameterizations exploiting these degeneracies is likely to be more interpretable, and we provide some evidence that such a representation is likely to have sparser interactions.
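
The degeneracies in question include exact weight-space symmetries. A concrete example, easy to check numerically, is the positive scaling symmetry of ReLU networks; the sketch below (illustrative, not the paper's code) shows a reparameterization that changes the weights but not the computed function.

```python
import numpy as np

# Positive scaling symmetry of a two-layer ReLU net f(x) = W2 @ relu(W1 @ x):
# scaling one hidden unit's input weights by c > 0 and its output weights by
# 1/c leaves f unchanged -- a weight-space degeneracy of the kind discussed.

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(4, 16))
x = rng.normal(size=(8,))

relu = lambda z: np.maximum(z, 0.0)
f = lambda w1, w2: w2 @ relu(w1 @ x)

c = 3.7
W1s, W2s = W1.copy(), W2.copy()
W1s[5] *= c        # scale inputs to hidden unit 5
W2s[:, 5] /= c     # compensate on its outputs

assert np.allclose(f(W1, W2), f(W1s, W2s))  # same function, different weights
```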

Learning Theory

Large Language Models can Strategically Deceive their Users when Put Under Pressure

1 code implementation • 9 Nov 2023 • Jérémy Scheurer, Mikita Balesni, Marius Hobbhahn

We demonstrate a situation in which Large Language Models, trained to be helpful, harmless, and honest, can display misaligned behavior and strategically deceive their users about this behavior without being instructed to do so.

Management

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

no code implementations • 26 Oct 2022 • Pablo Villalobos, Jaime Sevilla, Lennart Heim, Tamay Besiroglu, Marius Hobbhahn, Anson Ho

We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision, and extrapolate them using two methods: projecting the historical growth rate, and estimating the compute-optimal dataset size for future predicted compute budgets.
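
As a rough illustration of the two methods, with placeholder numbers rather than the paper's estimates, and using the Chinchilla heuristics C ≈ 6·N·D and D ≈ 20·N as a stand-in for the compute-optimal step:

```python
# Illustrative sketch of the two extrapolation methods; all numbers are
# placeholders, not the paper's estimates.

# Method 1: project the historical growth rate of dataset sizes.
current_tokens = 1e12          # assumed current dataset size (tokens)
annual_growth = 1.5            # assumed historical growth factor per year
years = 5
projected_tokens = current_tokens * annual_growth ** years

# Method 2: compute-optimal dataset size for a predicted compute budget,
# using the Chinchilla heuristics C ~= 6*N*D and D ~= 20*N as stand-ins.
compute_budget = 1e26          # predicted training FLOP
n_params = (compute_budget / (6 * 20)) ** 0.5   # from C = 6*N*(20*N)
optimal_tokens = 20 * n_params

print(f"growth-rate projection: {projected_tokens:.2e} tokens")
print(f"compute-optimal at {compute_budget:.0e} FLOP: {optimal_tokens:.2e} tokens")
```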

Compute Trends Across Three Eras of Machine Learning

1 code implementation • 11 Feb 2022 • Jaime Sevilla, Lennart Heim, Anson Ho, Tamay Besiroglu, Marius Hobbhahn, Pablo Villalobos

Since the advent of Deep Learning in the early 2010s, the scaling of training compute has accelerated, doubling approximately every 6 months.
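
A 6-month doubling time compounds quickly; a back-of-the-envelope calculation (the doubling time is from the abstract, the horizon is an arbitrary example):

```python
# What a 6-month doubling time implies for training compute growth.
doubling_time_years = 0.5
years = 4
growth_factor = 2 ** (years / doubling_time_years)
print(f"{years} years at a 6-month doubling time -> {growth_factor:.0f}x compute")
# 4 years -> 2**8 = 256x
```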

BIG-bench Machine Learning

Laplace Matching for fast Approximate Inference in Latent Gaussian Models

1 code implementation • 7 May 2021 • Marius Hobbhahn, Philipp Hennig

The method can be thought of as a pre-processing step that can be implemented in <5 lines of code and runs in less than a second.
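
As a concrete instance of such a closed-form map, re-derived here via a Laplace approximation (so worth checking against the paper), a Beta distribution transformed to logit space matches a Gaussian with mean log(α/β) and variance 1/α + 1/β:

```python
import numpy as np

def beta_to_logit_gaussian(alpha: float, beta: float) -> tuple[float, float]:
    """Laplace approximation of Beta(alpha, beta) after the logit transform
    z = log(x / (1 - x)): returns the mode and variance of the fitted
    Gaussian. The log-density in z peaks at z = log(alpha/beta) with
    curvature -(alpha*beta)/(alpha+beta), giving variance 1/alpha + 1/beta."""
    return np.log(alpha / beta), 1.0 / alpha + 1.0 / beta

mu, var = beta_to_logit_gaussian(6.0, 2.0)
print(f"N(mu={mu:.3f}, var={var:.3f}) in logit space")
```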

Bayesian Inference • Gaussian Processes • +1

Fast Predictive Uncertainty for Classification with Bayesian Deep Networks

1 code implementation • 2 Mar 2020 • Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig

We reconsider old work (the Laplace Bridge) to construct a Dirichlet approximation of the network's softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior to the Categorical distribution) in the output space.
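
For reference, a sketch of the Gaussian-logits-to-Dirichlet direction of this map, following the closed form usually reported for the Laplace Bridge; treat the exact expression as an assumption to verify against the paper:

```python
import numpy as np

def laplace_bridge(mu: np.ndarray, sigma2: np.ndarray) -> np.ndarray:
    """Map a diagonal Gaussian N(mu, diag(sigma2)) over K logits to Dirichlet
    parameters alpha, per the closed form reported for the Laplace Bridge:
    alpha_k = (1/sigma2_k) * (1 - 2/K + (exp(mu_k)/K^2) * sum_l exp(-mu_l))."""
    K = mu.shape[0]
    sum_exp_neg = np.sum(np.exp(-mu))
    return (1.0 / sigma2) * (1.0 - 2.0 / K + (np.exp(mu) / K**2) * sum_exp_neg)

mu = np.array([2.0, 0.5, -1.0])       # Gaussian mean over 3 logits
sigma2 = np.array([0.3, 0.3, 0.3])    # diagonal variances
alpha = laplace_bridge(mu, sigma2)
print("Dirichlet alpha:", alpha)      # larger alpha_k ~ more mass on class k
```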

Classification • General Classification
