Search Results for author: Michael Goin

Found 4 papers, 2 papers with code

Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks

no code implementations • ICML 2020 • Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh

In this paper, we present an in-depth analysis of methods for maximizing the sparsity of the activations in a trained neural network, and show that, when coupled with an efficient sparse-input convolution algorithm, we can leverage this sparsity for significant performance gains.

Image Classification

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

no code implementations • 6 May 2024 • Abhinav Agarwalla, Abhay Gupta, Alexandre Marques, Shubhra Pandit, Michael Goin, Eldar Kurtic, Kevin Leong, Tuan Nguyen, Mahmoud Salem, Dan Alistarh, Sean Lie, Mark Kurtz

We achieve this for the LLaMA-2 7B model by combining the SparseGPT one-shot pruning method and sparse pretraining of those models on a subset of the SlimPajama dataset mixed with a Python subset of The Stack dataset.

Arithmetic Reasoning, Code Generation, +2 more

Sparse Fine-tuning for Inference Acceleration of Large Language Models

2 code implementations • 10 Oct 2023 • Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

While the standard approach is to leverage sparsity for computational reduction, we observe that in the case of memory-bound LLMs sparsity can also be leveraged for reducing memory bandwidth.

Quantization, Text Generation, +1 more

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

2 code implementations • 14 Mar 2022 • Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh

We perform an in-depth study of the accuracy-compression trade-off for unstructured weight pruning of BERT models.

Quantization
