Search Results for author: Haoran Lian

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Due to their infrequent appearance in the text corpus, Scaffold Tokens pose a learning imbalance issue for language models.

We first investigate the imbalance of loss across token positions and develop a reciprocal law that holds across model scales and training stages.
