no code implementations • 16 Apr 2024 • Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
In this paper, we showcase HLAT: a 7 billion parameter decoder-only LLM pre-trained using trn1 instances over 1.8 trillion tokens.
1 code implementation • 8 Mar 2023 • Cody Hao Yu, Haozheng Fan, Guangtai Huang, Zhen Jia, Yizhi Liu, Jie Wang, Zach Zheng, Yuan Zhou, Haichen Shen, Junru Shao, Mu Li, Yida Wang
In this paper, we present RAF, a deep learning compiler for training.
no code implementations • 26 Nov 2019 • Han Shi, Haozheng Fan, James T. Kwok
We propose the triad decoder, which considers and predicts the three edges involved in a local triad together.