Search Results for author: Shi-Yu Xia

Found 1 paper, 0 papers with code

Exploring Learngene via Stage-wise Weight Sharing for Initializing Variable-sized Models

no code implementations • 25 Apr 2024 • Shi-Yu Xia, Wenxuan Zhu, Xu Yang, Xin Geng

When initializing variable-sized models adapted to different resource constraints, SWS (Stage-wise Weight Sharing) achieves better results than the pre-training and fine-tuning approach, while storing around 20x fewer parameters to initialize these models and reducing pre-training costs by around 10x.
