Search Results for author: Changsheng Zhao

Found 9 papers, 1 paper with code

SpinQuant: LLM quantization with learned rotations

no code implementations • 26 May 2024 Zechun Liu, Changsheng Zhao, Igor Fedorov, Bilge Soran, Dhruv Choudhary, Raghuraman Krishnamoorthi, Vikas Chandra, Yuandong Tian, Tijmen Blankevoort

In this work, we identify a collection of applicable rotation parameterizations that lead to identical outputs in full-precision Transformer architectures, and find that some random rotations lead to much better quantization than others, with an up to 13 points difference in downstream zero-shot reasoning performance.
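The rotation idea can be seen in a small sketch: for any orthogonal matrix R, (W R)(Rᵀ x) = W x in full precision, so rotating the weights changes nothing until quantization enters, at which point different rotations give different errors. A minimal numpy illustration (not the authors' implementation; the naive 4-bit quantizer and matrix sizes are arbitrary assumptions):

```python
# Minimal sketch (not the SpinQuant code): for an orthogonal R,
# (W R)(R^T x) == W x, but quantizing W R gives a different error than
# quantizing W, which is why the choice of rotation matters.
import numpy as np

def quantize_int4(w):
    """Naive symmetric per-tensor 4-bit quantization (illustrative only)."""
    scale = np.abs(w).max() / 7.0
    return np.round(w / scale).clip(-8, 7) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
x = rng.standard_normal(64)

# Random orthogonal rotation via QR decomposition.
R, _ = np.linalg.qr(rng.standard_normal((64, 64)))

# Full precision: outputs are identical with or without the rotation.
assert np.allclose(W @ x, (W @ R) @ (R.T @ x))

# Quantized: the error depends on the rotation used.
err_plain = np.linalg.norm(W @ x - quantize_int4(W) @ x)
err_rot = np.linalg.norm(W @ x - quantize_int4(W @ R) @ (R.T @ x))
print(f"quantization error without rotation: {err_plain:.4f}")
print(f"quantization error with rotation:    {err_rot:.4f}")
```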

Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications

no code implementations • 24 May 2024 Yang Li, Changsheng Zhao, Hyungtak Lee, Ernie Chang, Yangyang Shi, Vikas Chandra

Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding.
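The low-rank decomposition named in the title can be illustrated generically: a pretrained weight matrix is replaced by two thin factors, trading a small approximation error for fewer parameters and multiply-adds. The numpy sketch below uses a plain truncated SVD with an arbitrary rank; it is an illustration of the generic idea, not the paper's basis-selection procedure:

```python
# Hedged sketch of generic low-rank decomposition of a weight matrix.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((768, 768))   # stand-in for a pretrained weight
rank = 128                            # illustrative rank

U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * S[:rank]            # (768, 128)
B = Vt[:rank, :]                      # (128, 768)

x = rng.standard_normal(768)
full = W @ x
low_rank = A @ (B @ x)                # two thin matmuls replace one big one

print(f"relative error: {np.linalg.norm(full - low_rank) / np.linalg.norm(full):.3f}")
print(f"parameters: {W.size} -> {A.size + B.size}")
```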

On The Open Prompt Challenge In Conditional Audio Generation

no code implementations • 1 Nov 2023 Ernie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao, Yangyang Shi, Vikas Chandra

Text-to-audio generation (TTA) produces audio from a text description, learning from pairs of audio samples and hand-annotated text.

Audio Generation

Revisiting Sample Size Determination in Natural Language Understanding

1 code implementation • 1 Jul 2023 Ernie Chang, Muhammad Hassan Rashid, Pin-Jie Lin, Changsheng Zhao, Vera Demberg, Yangyang Shi, Vikas Chandra

Knowing exactly how many data points need to be labeled to achieve a certain model performance is a hugely beneficial step towards reducing the overall budgets for annotation.

Active Learning • Natural Language Understanding
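One generic way to approach this question is learning-curve extrapolation: fit a curve to accuracies measured at small labeled-set sizes, then invert it for a target accuracy. The sketch below illustrates only that generic idea; the power-law form, the pilot measurements, and the target value are assumptions, not the method of this paper:

```python
# Hedged sketch: estimate the labeled-set size needed for a target accuracy
# by fitting a power-law learning curve acc(n) ~ a - b * n**(-c) to a few
# small pilot runs. All numbers below are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, c):
    return a - b * n ** (-c)

sizes = np.array([100, 200, 400, 800, 1600], dtype=float)  # pilot set sizes
accs = np.array([0.62, 0.68, 0.73, 0.77, 0.80])            # pilot accuracies

(a, b, c), _ = curve_fit(learning_curve, sizes, accs,
                         p0=[0.9, 1.0, 0.3], maxfev=10000)

target = 0.85
if a > target:
    # Invert a - b * n**(-c) = target  =>  n = (b / (a - target)) ** (1 / c)
    n_needed = (b / (a - target)) ** (1.0 / c)
    print(f"estimated labels needed for {target:.0%} accuracy: ~{n_needed:,.0f}")
else:
    print("fitted curve plateaus below the target accuracy")
```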

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

no code implementations • 29 May 2023 Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra

Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits.

Data Free Quantization
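The two ingredients named in the title can be sketched in PyTorch: "data-free" means the training text is sampled from the frozen full-precision model itself, and quantization-aware training fake-quantizes weights in the forward pass with a straight-through estimator. The layer sizes, bit-width, and generation step below are illustrative assumptions, not the paper's setup:

```python
# Hedged sketch (not the LLM-QAT code) of quantization-aware training with
# a straight-through estimator, plus a note on the data-free ingredient.
import torch

def fake_quantize(w, bits=4):
    """Symmetric per-tensor fake quantization with a straight-through estimator."""
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    q = torch.clamp(torch.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    w_q = q * scale
    # Straight-through: forward uses w_q, gradient flows as if identity.
    return w + (w_q - w).detach()

class QuantLinear(torch.nn.Linear):
    def forward(self, x):
        return torch.nn.functional.linear(x, fake_quantize(self.weight), self.bias)

# "Data-free": synthetic training batches would be sampled from the frozen
# full-precision teacher, e.g. (pseudocode, hypothetical call):
#   synthetic_ids = teacher.generate(bos, do_sample=True, max_length=512)
# QAT: train a student whose Linear layers are replaced by QuantLinear,
# distilling from the teacher on those synthetic batches.
student_layer = QuantLinear(16, 16)
x = torch.randn(2, 16)
out = student_layer(x)      # forward pass uses 4-bit fake-quantized weights
out.sum().backward()        # STE lets gradients reach the latent fp weights
print(student_layer.weight.grad.shape)
```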

Hyperparameter-free Continual Learning for Domain Classification in Natural Language Understanding

no code implementations • NAACL 2021 Ting Hua, Yilin Shen, Changsheng Zhao, Yen-Chang Hsu, Hongxia Jin

Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data are significantly different.

Continual Learning • Domain Classification +1

Automatic Mixed-Precision Quantization Search of BERT

no code implementations • 30 Dec 2021 Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin

Knowledge distillation, weight pruning, and quantization are known to be the main directions in model compression.

Knowledge Distillation • Model Compression +2
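The generic idea behind a mixed-precision quantization search can be sketched as choosing a bit-width per layer so that total quantization error is minimized under an average-bit budget. The exhaustive search, toy layers, and budget below are assumptions for illustration, not the search strategy of this paper:

```python
# Hedged sketch of mixed-precision bit-width assignment under a budget.
import itertools
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)) * s for s in (0.5, 1.0, 2.0)]  # toy layers
candidate_bits = (2, 4, 8)
budget = 5.0  # maximum average bits per layer

def quant_error(w, bits):
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    return float(np.linalg.norm(w - q))

best = None
for assignment in itertools.product(candidate_bits, repeat=len(layers)):
    if np.mean(assignment) > budget:
        continue  # violates the average-bit budget
    err = sum(quant_error(w, b) for w, b in zip(layers, assignment))
    if best is None or err < best[1]:
        best = (assignment, err)

print(f"best per-layer bit-widths under a {budget}-bit average: {best[0]}, error {best[1]:.2f}")
```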
