ST-MoE-L 4.1B (fine-tuned)
1 paper with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in ST-MoE-L 4.1B (fine-tuned)
No evaluation results yet.
Most implemented papers
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Advancing the state of the art across a broad set of natural language tasks has been hindered by training instabilities and uncertain quality during fine-tuning.