Search Results for author: Khiem Le

Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

Domain shift is a formidable issue in Machine Learning that causes a model to suffer from performance degradation when tested on unseen domains.

By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models.
