no code implementations • 22 Mar 2024 • Khiem Le, Long Ho, Cuong Do, Danh Le-Phuoc, Kok-Seng Wong
Domain shift is a formidable issue in Machine Learning that causes a model to suffer from performance degradation when tested on unseen domains.
1 code implementation • 12 Dec 2023 • Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Bint T. Nguyen, Chenghao Liu, Savitha Ramasamy, XiaoLi Li, Steven Hoi
By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models.