Search Results for author: Shiyao Li

Found 7 papers, 2 papers with code

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

no code implementations · 4 Jun 2024 · Tianchen Zhao, Tongcheng Fang, Enshu Liu, Rui Wan, Widyadewi Soedarmadji, Shiyao Li, Zinan Lin, Guohao Dai, Shengen Yan, Huazhong Yang, Xuefei Ning, Yu Wang

Diffusion transformers (DiTs) have exhibited remarkable performance in visual generation tasks, such as generating realistic images or videos based on textual instructions.

Quantization, Video Generation

Evaluating Quantized Large Language Models

1 code implementation · 28 Feb 2024 · Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang

Specifically, post-training quantization (PTQ) can effectively reduce both memory consumption and computational overhead in LLMs.
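To illustrate what PTQ does at its simplest, here is a minimal sketch (not the paper's actual code) of symmetric per-tensor 8-bit weight quantization, showing the 4x memory reduction over fp32:

```python
# Minimal post-training quantization sketch (assumed example, not the paper's method):
# symmetric per-tensor 8-bit quantization of a weight matrix.
import numpy as np

def quantize_sym(w: np.ndarray, n_bits: int = 8):
    """Map float weights to signed integers using a single scale factor."""
    qmax = 2 ** (n_bits - 1) - 1              # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax            # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_sym(w)
w_hat = dequantize(q, scale)

print(w.nbytes // q.nbytes)                   # memory ratio: fp32 vs int8
print(float(np.abs(w - w_hat).max()))         # worst-case rounding error
```

Real PTQ pipelines for LLMs (e.g. GPTQ, as evaluated in the paper) additionally use calibration data and per-group scales, but the core idea of replacing float weights with low-bit integers plus scales is the same.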

Quantization

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

1 code implementation · 6 Feb 2024 · Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, Yu Wang

In contrast, the average context lengths of mainstream benchmarks are insufficient (5k-21k), and they suffer from potential knowledge leakage and inaccurate metrics, resulting in biased evaluation.


FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs

no code implementations · 8 Jan 2024 · Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang

However, existing GPU and transformer-based accelerators cannot efficiently process compressed LLMs, due to the following unresolved challenges: low computational efficiency, underutilized memory bandwidth, and large compilation overheads.

Computational Efficiency, Language Modelling, +2

Enabling Fast 2-bit LLM on GPUs: Memory Alignment and Asynchronous Dequantization

no code implementations · 28 Nov 2023 · Jinhao Li, Shiyao Li, Jiaming Xu, Shan Huang, Yaoxiu Lian, Jun Liu, Yu Wang, Guohao Dai

Weights are quantized in groups, but in some groups the range of weights is large, resulting in large quantization errors and non-negligible accuracy loss (e.g., >3% for Llama2-7b with 2-bit quantization in GPTQ and Greenbit).

Quantization

Characterizing Datasets for Social Visual Question Answering, and the New TinySocial Dataset

no code implementations · 8 Oct 2020 · Zhanwen Chen, Shiyao Li, Roxanne Rashedi, Xiaoman Zi, Morgan Elrod-Erickson, Bryan Hollis, Angela Maliakal, Xinyu Shen, Simeng Zhao, Maithilee Kunda

Modern social intelligence includes the ability to watch videos and answer questions about social and theory-of-mind-related content, e.g., for a scene in Harry Potter, "Is the father really upset about the boys flying the car?"

Question Answering, Visual Question Answering
