no code implementations • 1 Mar 2024 • Vithursan Thangarasa, Mahmoud Salem, Shreyas Saxena, Kevin Leong, Joel Hestness, Sean Lie
Large language models (LLMs) are typically trained on general source data for various domains, but a recent surge in domain-specific LLMs has shown their potential to outperform general-purpose models in domain-specific tasks (e.g., biomedicine).
Ranked #10 on Question Answering on PubMedQA
2 code implementations • 21 Mar 2023 • Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie
Recent research has focused on weight sparsity in neural network training to reduce FLOPs, aiming for improved efficiency (test accuracy w.r.t. training FLOPs).
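The core idea can be sketched in a few lines: with a fixed binary mask over the weights, only the surviving entries participate in training, so matmul FLOPs scale with the unmasked fraction. This is a minimal illustrative sketch (a static random mask; names are not the paper's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_mask(shape, sparsity, rng):
    """Random binary mask keeping a (1 - sparsity) fraction of weights."""
    return (rng.random(shape) >= sparsity).astype(np.float64)

W = rng.normal(size=(16, 16))            # dense weight matrix
mask = make_sparse_mask(W.shape, sparsity=0.75, rng=rng)
W_sparse = W * mask                      # only unmasked weights participate

# FLOPs for a matmul are proportional to the number of surviving weights.
dense_flops = W.size
sparse_flops = int(mask.sum())
```

At 75% sparsity, roughly a quarter of the dense FLOPs remain; practical methods pair this with sparsity-aware kernels and mask-update schedules.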
no code implementations • 18 Mar 2023 • Vithursan Thangarasa, Abhay Gupta, William Marshall, Tianda Li, Kevin Leong, Dennis Decoste, Sean Lie, Shreyas Saxena
In this work, we show the benefits of using unstructured weight sparsity to train only a subset of weights during pre-training (Sparse Pre-training) and then recover the representational capacity by allowing the zeroed weights to learn (Dense Fine-tuning).
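The sparse-to-dense recipe can be illustrated on a toy least-squares problem (a stand-in task, not the paper's setup): pre-train with a fixed binary mask that keeps only a subset of weights, then drop the mask so the zeroed weights are free to learn.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=8)
y = X @ w_true

w = np.zeros(8)
mask = (rng.random(8) < 0.5).astype(float)   # subset of weights to pre-train

def sgd_step(w, lr=0.05):
    grad = X.T @ (X @ w - y) / len(X)
    return w - lr * grad

# Sparse pre-training: masked-out weights stay at zero.
for _ in range(300):
    w = sgd_step(w) * mask
sparse_loss = float(np.mean((X @ w - y) ** 2))

# Dense fine-tuning: the mask is removed, recovering full capacity.
for _ in range(300):
    w = sgd_step(w)
dense_loss = float(np.mean((X @ w - y) ** 2))
```

The dense phase recovers the error left by the restricted-capacity sparse phase, which is the qualitative behavior the abstract describes.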
no code implementations • 11 Jun 2021 • Pavan Kumar Anasosalu Vasu, Shreyas Saxena, Oncel Tuzel
When applied to datasets where one or more tasks can have noisy annotations, the proposed method learns to prioritize learning from clean labels for a given task, e.g., reducing surface estimation errors by up to 60%.
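As a rough illustration of prioritizing clean labels (a simple down-weighting heuristic standing in for the paper's actual weighting framework), one can give less weight to high-loss instances within each task:

```python
import numpy as np

# Rows are instances, columns are tasks; the large entries mimic the
# elevated losses that noisy annotations tend to produce.
per_task_losses = np.array([
    [0.2, 3.5],
    [0.3, 0.4],
    [2.8, 0.1],
])

# Down-weight high-loss (likely noisy) instances within each task,
# then renormalize so the weights for each task sum to one.
weights = np.exp(-per_task_losses)
weights /= weights.sum(axis=0, keepdims=True)

total_loss = float((weights * per_task_losses).sum())
```

Under this scheme, a task's clean instances dominate its weighted loss, while instances with suspect annotations for that task contribute little.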
no code implementations • 27 May 2021 • Shreyas Saxena, Nidhi Vyas, Dennis Decoste
This setting is widely adopted under the assumption that loss functions for each instance are similar in nature, and hence, a common learning rate can be used.
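The contrast with a single shared learning rate can be sketched on a toy 1-D regression, where each training instance gets its own rate (held fixed here for brevity; a full method would adapt them during training — names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
xs = rng.uniform(0.5, 1.5, size=n)
ys = 2.0 * xs                          # true slope is 2

# One learning rate per instance instead of a single common rate.
per_instance_lr = np.linspace(0.05, 0.15, n)

w = 0.0
for _ in range(300):
    for i in range(n):
        grad = (w * xs[i] - ys[i]) * xs[i]   # squared-error gradient
        w -= per_instance_lr[i] * grad
```

Each update scales an instance's gradient by its own rate, which is the degree of freedom the common-learning-rate assumption removes.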
no code implementations • 18 Feb 2021 • Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir
We propose dynamic curriculum learning via data parameters for noise robust keyword spotting.
no code implementations • 20 Sep 2020 • Nidhi Vyas, Shreyas Saxena, Thomas Voice
One-hot labels do not represent soft decision boundaries among concepts, and hence, models trained on them are prone to overfitting.
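The difference between hard and soft targets is easy to see with plain label smoothing (used here only as a simple stand-in for learned soft labels): some probability mass is moved off the true class onto the others, softening the decision boundary.

```python
import numpy as np

num_classes = 4
one_hot = np.eye(num_classes)[2]       # hard target: [0, 0, 1, 0]

# Label smoothing: reserve eps of the mass and spread it uniformly.
eps = 0.1
soft = one_hot * (1 - eps) + eps / num_classes
```

The smoothed target still peaks at the true class but assigns nonzero probability everywhere, which discourages the overconfident fits that one-hot training encourages.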
1 code implementation • NeurIPS 2019 • Shreyas Saxena, Oncel Tuzel, Dennis Decoste
To the best of our knowledge, our work is the first curriculum learning method to show gains on large scale image classification and detection tasks.
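A sketch of the data-parameter idea: each sample (or class) gets a learnable temperature that scales its logits before the softmax, so the optimizer can soften the loss contribution of hard or noisy samples early in training (variable names are illustrative, and the temperatures are swept rather than learned here):

```python
import numpy as np

def softmax(z):
    z = z - z.max()                    # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])    # correct class has the top logit
target = 0

# A larger temperature flattens the distribution, raising this sample's
# loss toward the uniform baseline and reducing its gradient sharpness.
losses = {}
for sigma in (0.5, 1.0, 2.0):
    p = softmax(logits / sigma)
    losses[sigma] = float(-np.log(p[target]))
```

Driving a sample's temperature up effectively defers it to later in training, which is how a differentiable curriculum emerges from these parameters.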
no code implementations • 17 Mar 2018 • Syed Ashar Javed, Shreyas Saxena, Vineet Gandhi
Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.
no code implementations • NeurIPS 2016 • Shreyas Saxena, Jakob Verbeek
Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem.