no code implementations • 20 Apr 2024 • Yebo Wu, Li Li, Chunlin Tian, Chengzhong Xu
To preserve the feature representation of each block, we decouple the training process into two stages: progressive model shrinking and progressive model growing.