no code implementations • 25 Feb 2024 • Tam Nguyen, César A. Uribe, Tan M. Nguyen, Richard G. Baraniuk
Motivated by this control framework, we derive a novel class of transformers, PID-controlled Transformer (PIDformer), aimed at improving robustness and mitigating the rank-collapse issue inherent in softmax transformers.
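The control primitive behind PIDformer is the classical proportional–integral–derivative feedback loop. The sketch below is a generic discrete PID controller driving a scalar state toward a reference, not the paper's transformer formulation; the gains and plant are illustrative choices.

```python
# Generic discrete PID feedback loop (illustrative sketch, not the
# PIDformer model itself): control = P + I + D terms on the error.
def pid_step(error, prev_error, integral, kp, ki, kd, dt=1.0):
    """One PID update: returns (control signal, updated integral term)."""
    integral = integral + error * dt          # I: accumulated error
    derivative = (error - prev_error) / dt    # D: rate of change of error
    control = kp * error + ki * integral + kd * derivative
    return control, integral

# Drive a scalar state toward the reference value 1.0.
state, integral, prev_error = 0.0, 0.0, 0.0
for _ in range(50):
    error = 1.0 - state
    control, integral = pid_step(error, prev_error, integral,
                                 kp=0.5, ki=0.1, kd=0.05)
    prev_error = error
    state += control
```

The integral term removes steady-state error and the derivative term damps overshoot; PIDformer's premise is that the same feedback mechanisms can stabilize token representations across layers.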
no code implementations • 1 Dec 2023 • Tam Nguyen, Tan M. Nguyen, Richard G. Baraniuk
Transformers have achieved remarkable success in a wide range of natural language processing and computer vision applications.
no code implementations • 6 Nov 2023 • Tuan Nguyen, Tam Nguyen, Vinh Nguyen, Tan M. Nguyen
$p$-Laplacian regularization, rooted in graph and image signal processing, introduces a parameter $p$ to control the regularization effect on these data.
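A minimal sketch of what the parameter $p$ controls, assuming the standard graph p-Laplacian energy $E(x) = \frac{1}{p}\sum_{(i,j)} |x_i - x_j|^p$ (the graph, signal, and step size below are illustrative, not from the paper). Gradient descent on this energy smooths node features; $p = 2$ recovers ordinary Laplacian smoothing, while smaller $p$ penalizes large jumps less and so preserves sharp transitions.

```python
# Graph p-Laplacian smoothing: gradient descent on
# E(x) = (1/p) * sum_{(i,j) in edges} |x_i - x_j|^p.
def p_laplacian_step(x, edges, p, lr=0.1):
    grad = [0.0] * len(x)
    for i, j in edges:
        diff = x[i] - x[j]
        # d/dx_i of |x_i - x_j|^p / p  =  |diff|^(p-1) * sign(diff)
        g = abs(diff) ** (p - 1) * (1 if diff >= 0 else -1)
        grad[i] += g
        grad[j] -= g
    return [xi - lr * gi for xi, gi in zip(x, grad)]

# A 4-node path graph with a step signal; repeated steps diffuse it.
edges = [(0, 1), (1, 2), (2, 3)]
x = [0.0, 0.0, 1.0, 1.0]
for _ in range(100):
    x = p_laplacian_step(x, edges, p=2)  # p = 2: plain Laplacian smoothing
```

With $p = 2$ the signal diffuses toward the mean (0.5 at every node), which is exactly the over-smoothing behavior that tuning $p$ is meant to control.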
no code implementations • 6 Nov 2023 • Tuan Nguyen, Hirotada Honda, Takashi Sano, Vinh Nguyen, Shugo Nakamura, Tan M. Nguyen
We propose the Kuramoto Graph Neural Network (KuramotoGNN), a novel class of continuous-depth graph neural networks (GNNs) that employs the Kuramoto model to mitigate the over-smoothing phenomenon, in which node features in GNNs become indistinguishable as the number of layers increases.
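The Kuramoto model the paper adapts describes coupled phase oscillators, $\dot\theta_i = \omega_i + \frac{K}{N}\sum_j \sin(\theta_j - \theta_i)$. The sketch below integrates it with Euler steps on an all-to-all graph (the frequencies, coupling, and step size are illustrative); full synchronization is the oscillator analogue of over-smoothing, and KuramotoGNN's premise is that the coupling can be set to avoid it.

```python
import math

# Kuramoto model of N coupled oscillators:
# d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
def kuramoto_step(theta, omega, K, dt=0.01):
    n = len(theta)
    dtheta = [
        omega[i] + (K / n) * sum(math.sin(theta[j] - theta[i]) for j in range(n))
        for i in range(n)
    ]
    return [t + dt * d for t, d in zip(theta, dtheta)]

# Identical natural frequencies and strong coupling: phases synchronize,
# i.e. all oscillators collapse to a common phase (the "over-smoothed" regime).
theta = [0.0, 1.0, 2.0]
omega = [0.0, 0.0, 0.0]
for _ in range(2000):
    theta = kuramoto_step(theta, omega, K=2.0, dt=0.01)
spread = max(theta) - min(theta)
```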
no code implementations • 11 Jun 2023 • Son Nguyen, Cuong Tran Manh, Kien T. Tran, Tan M. Nguyen, Thu-Trang Nguyen, Kien-Tuan Ngo, Hieu Dinh Vo
To implement this idea in the recommendation process, ARIST combines program analysis (PA), language models (LMs), and several features specialized for the recommendation task, which consider the functionality of formal parameters and the positional information of code elements (e.g., variables or method calls) in the given context.
1 code implementation • 16 Oct 2021 • Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher
Inspired by this observation, we propose Transformer with a Mixture of Gaussian Keys (Transformer-MGK), a novel transformer architecture that replaces redundant heads in transformers with a mixture of keys at each head.
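One way to read "a mixture of keys at each head" is probabilistic: the attention score of a query against a position is its log-likelihood under a Gaussian mixture centered at that position's keys. The sketch below follows that reading with hypothetical inputs; it is not the authors' implementation.

```python
import math

# Attention weights where each position holds a *mixture* of Gaussian keys:
# score(q, position j) = log of the mean Gaussian likelihood of q under
# the keys at position j, followed by a softmax over positions.
def mgk_attention_weights(q, key_mixtures, sigma=1.0):
    scores = []
    for keys in key_mixtures:  # keys: list of mixture centers for one position
        likelihood = sum(
            math.exp(-sum((qi - ki) ** 2 for qi, ki in zip(q, k))
                     / (2 * sigma ** 2))
            for k in keys
        ) / len(keys)
        scores.append(math.log(likelihood + 1e-12))
    m = max(scores)                            # stabilized softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

q = [1.0, 0.0]
key_mixtures = [[[1.0, 0.0], [0.9, 0.1]],      # position 0: keys near q
                [[-1.0, 0.0], [-0.9, -0.1]]]   # position 1: keys far from q
w = mgk_attention_weights(q, key_mixtures)     # position 0 gets more weight
```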
1 code implementation • NeurIPS 2021 • Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference.
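The continuous limit referred to is the heavy ball ODE $\ddot{x} + \gamma\dot{x} = -\nabla f(x)$, a damped second-order system whose discretization is momentum gradient descent. The sketch below integrates it with explicit Euler on the toy objective $f(x) = \tfrac{1}{2}x^2$ (an illustrative choice, not from the paper).

```python
# Heavy ball ODE: x'' + gamma * x' = -grad f(x), written as the
# first-order system x' = v, v' = -gamma * v - grad f(x).
def heavy_ball_trajectory(x0, gamma=1.0, dt=0.01, steps=2000):
    x, v = x0, 0.0               # position and velocity (the momentum state)
    for _ in range(steps):
        grad = x                 # gradient of f(x) = 0.5 * x^2
        v += dt * (-gamma * v - grad)
        x += dt * v
    return x

x_final = heavy_ball_trajectory(5.0)  # damped toward the minimum at 0
```

HBNODEs carry this velocity state alongside the feature state, which is what speeds up both the forward ODE solve and the adjoint backward pass.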
no code implementations • NeurIPS 2021 • Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
2 code implementations • NeurIPS 2020 • Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
Designing deep neural networks is an art that often involves an expensive search over candidate architectures.
1 code implementation • 24 Feb 2020 • Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst.
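For reference, the NAG iteration being discussed takes a gradient step at a lookahead point and then extrapolates with the $t_k$ momentum schedule. The sketch below runs it on the toy objective $f(x) = \tfrac{1}{2}x^2$; it shows plain NAG only, not the paper's scheduled-restart variant, and the step size and objective are illustrative.

```python
# Nesterov accelerated gradient with the standard t_k momentum schedule.
def nag(grad, x0, lr=0.5, steps=100):
    x, y = x0, x0          # y is the lookahead point
    t = 1.0
    for _ in range(steps):
        x_next = y - lr * grad(y)                        # step at lookahead
        t_next = (1 + (1 + 4 * t * t) ** 0.5) / 2        # momentum schedule
        y = x_next + ((t - 1) / t_next) * (x_next - x)   # extrapolation
        x, t = x_next, t_next
    return x

# Minimize f(x) = 0.5 * x^2, whose gradient is x (lr = 0.5 <= 1/L here).
x_final = nag(lambda x: x, x0=10.0)
```

With exact gradients the extrapolation accelerates convergence; with noisy gradients that same extrapolation is what accumulates error, motivating restarts.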
no code implementations • 9 Dec 2019 • Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar
Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.
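The exact likelihood comes from the instantaneous change-of-variables formula: alongside $\dot{z} = f(z, t)$, the log-density evolves as $\frac{d \log p}{dt} = -\mathrm{tr}\!\left(\frac{\partial f}{\partial z}\right)$. The sketch below checks this on an illustrative 1-D linear flow $f(z) = az$, where the trace is just $a$ and the log-density change over time $T$ is exactly $-aT$.

```python
# CNF log-density bookkeeping for the 1-D linear flow f(z) = a*z:
# integrate dz/dt = a*z and d(log p)/dt = -trace(df/dz) = -a jointly.
def cnf_logdet(a, T, dt=1e-3):
    z, delta_logp = 1.0, 0.0
    t = 0.0
    while t < T:
        z += dt * a * z          # Euler step for the state
        delta_logp -= dt * a     # accumulate -trace(df/dz)
        t += dt
    return delta_logp

val = cnf_logdet(a=0.5, T=2.0)   # analytic answer: -a*T = -1.0
```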