no code implementations • 7 May 2024 • Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu
To address these challenges, this paper presents a hybrid system that incorporates a wide-angle camera, a high-speed search camera, and a galvano-mirror.
2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model for systematic analysis of LLM inference techniques.
no code implementations • 8 Feb 2024 • Zhikai Li, Xuewen Liu, Jing Zhang, Qingyi Gu
In particular, for the former, we introduce a learnable per-channel dual clipping scheme, which is designed to efficiently identify outliers in the unbalanced activations with fine granularity.
1 code implementation • 9 Jan 2024 • Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu
Specifically, at the calibration sample level, we select calibration samples based on their density and diversity in the latent space, thus facilitating the alignment of their distribution with that of the overall samples; and at the reconstruction output level, we propose Fine-grained Block Reconstruction, which can align the outputs of the quantized model and the full-precision model at different network granularities.
no code implementations • 11 Oct 2023 • Zhikai Li, Xiaoxuan Liu, Banghua Zhu, Zhen Dong, Qingyi Gu, Kurt Keutzer
Large Language Models (LLMs) have showcased remarkable impacts across a wide spectrum of natural language processing tasks.
no code implementations • 24 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
In this paper, we first argue empirically that the severe performance degradation is mainly caused by the weight oscillation in the binarization training and the information distortion in the activation of ViTs.
no code implementations • 11 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks.
1 code implementation • ICCV 2023 • Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique.
1 code implementation • 13 Sep 2022 • Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu
In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT.
1 code implementation • ICCV 2023 • Zhikai Li, Qingyi Gu
In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs, to enable ViTs to perform the entire computational graph of inference with integer arithmetic and bit-shifting, and without any floating-point arithmetic.
1 code implementation • 4 Mar 2022 • Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu
The above insights guide us to design a relative value metric to optimize the Gaussian noise to approximate the real images, which are then utilized to calibrate the quantization parameters.
1 code implementation • ECCV 2020 • Yiming Hu, Yuding Liang, Zichao Guo, Ruosi Wan, Xiangyu Zhang, Yichen Wei, Qingyi Gu, Jian Sun
Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.
no code implementations • 27 Feb 2019 • Yiming Hu, Siyang Sun, Jianquan Li, Jiagang Zhu, Xingang Wang, Qingyi Gu
Particularly, we introduce an additional loss to encode the differences in the feature and semantic distributions within feature maps between the baseline model and the pruned one.
no code implementations • 27 Feb 2019 • Yiming Hu, Jianquan Li, Xianlei Long, Shenhua Hu, Jiagang Zhu, Xingang Wang, Qingyi Gu
Deep neural networks (DNNs) have achieved great success in a wide range of computer vision areas, but their application to mobile devices is limited by their high storage and computational cost.
no code implementations • 29 May 2018 • Yiming Hu, Siyang Sun, Jianquan Li, Xingang Wang, Qingyi Gu
To accelerate the selection process, the proposed method formulates it as a search problem, which can be solved efficiently by a genetic algorithm.