Search Results for author: Xiansheng Chen

Found 1 paper, 0 papers with code

Temporal Scaling Law for Large Language Models

no code implementations • 27 Apr 2024 • Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Jianwei Niu, Guiguang Ding

We first investigate the imbalance of loss across token positions and develop a reciprocal law that holds across model scales and training stages.
