ST-MoE-L 4.1B (fine-tuned)

1 paper with code • 0 benchmarks • 0 datasets


Most implemented papers

ST-MoE: Designing Stable and Transferable Sparse Expert Models

tensorflow/mesh • 17 Feb 2022

Advancing the state of the art across a broad set of natural language tasks has, however, been hindered by training instabilities and uncertain quality during fine-tuning.
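The ST-MoE paper attributes much of this instability to large routing logits and mitigates it with an auxiliary router z-loss that penalizes the log-partition of the gating scores. Below is a minimal, illustrative sketch of that loss in plain NumPy; the function name, array shapes, and example values are assumptions for demonstration and not the paper's reference implementation.

```python
import numpy as np

def router_z_loss(router_logits: np.ndarray) -> float:
    """Router z-loss over a batch of per-token expert logits.

    router_logits: array of shape [num_tokens, num_experts] holding the
    raw (pre-softmax) routing scores from the gating network. The loss
    penalizes large logit magnitudes, which the ST-MoE paper identifies
    as a source of training instability in sparse expert models.
    """
    # Numerically stable logsumexp over the expert dimension for each token.
    max_logits = router_logits.max(axis=-1, keepdims=True)
    logsumexp = max_logits.squeeze(-1) + np.log(
        np.exp(router_logits - max_logits).sum(axis=-1)
    )
    # Mean of the squared log-partition terms over all tokens.
    return float(np.mean(logsumexp ** 2))

# Example (hypothetical values): 8 tokens routed over 4 experts.
logits = np.random.randn(8, 4)
print(router_z_loss(logits))
```

In practice this term is added to the training objective with a small coefficient alongside the usual load-balancing loss, so it discourages the router logits from growing without changing which experts are selected.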