2 code implementations • 10 Dec 2022 • Haichen Huang, Jiarui Fang, Hongxin Liu, Shenggui Li, Yang You
To reduce GPU memory usage, memory partitioning and memory offloading have been proposed.
1 code implementation • 28 Oct 2021 • Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang, Haichen Huang, Yuliang Liu, Boxiang Wang, Yang You
The success of Transformer models has pushed the deep learning model scale to billions of parameters.