Search Results for author: Jung Hwan Heo

Found 4 papers, 2 papers with code

Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models

1 code implementation | 27 Sep 2023 | Jung Hwan Heo, Jeonghoon Kim, Beomseok Kwon, Byeongwook Kim, Se Jung Kwon, Dongsoo Lee

Weight-only quantization can be a promising approach, but sub-4 bit quantization remains a challenge due to large-magnitude activation outliers.

Language Modelling, Quantization

A Fast Training-Free Compression Framework for Vision Transformers

1 code implementation | 4 Mar 2023 | Jung Hwan Heo, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram

Token pruning has emerged as an effective solution to speed up the inference of large Transformer models.

Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators

No code implementations | 30 Jun 2022 | Jung Hwan Heo, Arash Fayyazi, Amirhossein Esmaili, Massoud Pedram

This paper introduces the sparse periodic systolic (SPS) dataflow, which advances the state-of-the-art hardware accelerator for supporting lightweight neural networks.
