no code implementations • 7 Aug 2023 • Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li
With integer models, we increase the accuracy of ResNet-18 on ImageNet by 1.31% and ResNet-50 by 0.90% at equivalent model cost relative to previous methods.
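As background for the integer-model results above, here is a minimal sketch of standard symmetric per-tensor int8 quantization; this is generic context, not the paper's specific method, and the function names are illustrative.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from integer codes."""
    return q.astype(np.float32) * scale

# Example: measure the round-trip quantization error on a weight tensor.
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```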
no code implementations • ICCV 2023 • Cheng Fu, Hanxian Huang, Zixuan Jiang, Yun Ni, Lifeng Nai, Gang Wu, Liqun Cheng, Yanqi Zhou, Sheng Li, Andrew Li, Jishen Zhao
One promising way to accelerate transformer training is to reuse small pretrained models to initialize the transformer, as their existing representation power facilitates faster model convergence.
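One naive way to realize this kind of reuse is to copy every parameter whose name and shape match from the small pretrained model into the larger one, leaving the rest at their default initialization. The sketch below illustrates that idea under those assumptions; the paper's actual initialization scheme (e.g., how layers of different widths or depths are mapped) may differ.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def init_from_small(large: nn.Module, small: nn.Module) -> None:
    """Copy parameters with matching names and shapes from a small
    pretrained model into a larger one; unmatched parameters keep
    their default (random) initialization."""
    small_state = small.state_dict()
    large_state = large.state_dict()
    for name, tensor in small_state.items():
        if name in large_state and large_state[name].shape == tensor.shape:
            large_state[name].copy_(tensor)
    large.load_state_dict(large_state)
```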
no code implementations • CVPR 2021 • Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi
On top of our DC-accelerator-optimized neural architecture search space, we further propose latency-aware compound scaling (LACS), the first multi-objective compound scaling method optimizing both accuracy and latency.
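A common way to fold accuracy and latency into a single scalar objective in latency-aware search is a soft-exponential reward of the kind popularized by MnasNet; the sketch below shows that generic form, not LACS's exact objective, and the parameter values are assumptions.

```python
def latency_aware_objective(accuracy: float, latency_ms: float,
                            target_ms: float, w: float = -0.07) -> float:
    """Scalarized multi-objective score: trade accuracy against
    measured latency. Architectures slower than the target are
    penalized; faster ones are mildly rewarded."""
    return accuracy * (latency_ms / target_ms) ** w

# Example: a slightly slower but more accurate candidate vs. the target.
print(latency_aware_objective(accuracy=0.78, latency_ms=12.0, target_ms=10.0))
```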
1 code implementation • 5 Oct 2019 • Pengyu Cheng, Yitong Li, Xinyuan Zhang, Liqun Cheng, David Carlson, Lawrence Carin
The relative importance of global versus local structure for the embeddings is learned automatically.
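One simple way to make such a trade-off learnable is a single gated mixture of a global-structure loss and a local-structure loss, with the gate trained alongside the embeddings. The module below is an illustrative sketch under that assumption (the class and argument names are hypothetical), not the paper's formulation.

```python
import torch
import torch.nn as nn

class GlobalLocalMix(nn.Module):
    """Illustrative: one learnable gate weighs a 'global' loss term
    against a 'local' one, so the balance is learned from data
    rather than hand-tuned."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(1))  # unconstrained logit

    def forward(self, global_loss: torch.Tensor,
                local_loss: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.alpha)  # gate in (0, 1)
        return g * global_loss + (1.0 - g) * local_loss
```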