no code implementations • 25 Feb 2024 • Tam Nguyen, César A. Uribe, Tan M. Nguyen, Richard G. Baraniuk
Motivated by this control framework, we derive a novel class of transformers, PID-controlled Transformer (PIDformer), aimed at improving robustness and mitigating the rank-collapse issue inherent in softmax transformers.
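The control primitive behind PIDformer is the classical proportional–integral–derivative feedback loop. The sketch below is a generic discrete PID controller driving a scalar state toward a reference, not the paper's transformer formulation; the gains and plant are illustrative choices.

```python
# Generic discrete PID feedback loop (illustrative sketch, not the
# PIDformer model itself): control = P + I + D terms on the error.
def pid_step(error, prev_error, integral, kp, ki, kd, dt=1.0):
    """One PID update: returns (control signal, updated integral term)."""
    integral = integral + error * dt          # I: accumulated error
    derivative = (error - prev_error) / dt    # D: rate of change of error
    control = kp * error + ki * integral + kd * derivative
    return control, integral

# Drive a scalar state toward the reference value 1.0.
state, integral, prev_error = 0.0, 0.0, 0.0
for _ in range(50):
    error = 1.0 - state
    control, integral = pid_step(error, prev_error, integral,
                                 kp=0.5, ki=0.1, kd=0.05)
    prev_error = error
    state += control
```

The integral term removes steady-state error and the derivative term damps overshoot; PIDformer's premise is that the same feedback mechanisms can stabilize token representations across layers.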
no code implementations • 1 Dec 2023 • Tam Nguyen, Tan M. Nguyen, Richard G. Baraniuk
Transformers have achieved remarkable success in a wide range of natural language processing and computer vision applications.
no code implementations • 6 Nov 2023 • Tuan Nguyen, Tam Nguyen, Vinh Nguyen, Tan M. Nguyen
$p$-Laplacian regularization, rooted in graph and image signal processing, introduces a parameter $p$ to control the regularization effect on these data.
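A minimal sketch of what the parameter $p$ controls, assuming the standard graph p-Laplacian energy $E(x) = \frac{1}{p}\sum_{(i,j)} |x_i - x_j|^p$ (the graph, signal, and step size below are illustrative, not from the paper). Gradient descent on this energy smooths node features; $p = 2$ recovers ordinary Laplacian smoothing, while smaller $p$ penalizes large jumps less and so preserves sharp transitions.

```python
# Graph p-Laplacian smoothing: gradient descent on
# E(x) = (1/p) * sum_{(i,j) in edges} |x_i - x_j|^p.
def p_laplacian_step(x, edges, p, lr=0.1):
    grad = [0.0] * len(x)
    for i, j in edges:
        diff = x[i] - x[j]
        # d/dx_i of |x_i - x_j|^p / p  =  |diff|^(p-1) * sign(diff)
        g = abs(diff) ** (p - 1) * (1 if diff >= 0 else -1)
        grad[i] += g
        grad[j] -= g
    return [xi - lr * gi for xi, gi in zip(x, grad)]

# A 4-node path graph with a step signal; repeated steps diffuse it.
edges = [(0, 1), (1, 2), (2, 3)]
x = [0.0, 0.0, 1.0, 1.0]
for _ in range(100):
    x = p_laplacian_step(x, edges, p=2)  # p = 2: plain Laplacian smoothing
```

With $p = 2$ the signal diffuses toward the mean (0.5 at every node), which is exactly the over-smoothing behavior that tuning $p$ is meant to control.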
no code implementations • 6 Nov 2023 • Tuan Nguyen, Hirotada Honda, Takashi Sano, Vinh Nguyen, Shugo Nakamura, Tan M. Nguyen
We propose the Kuramoto Graph Neural Network (KuramotoGNN), a novel class of continuous-depth graph neural networks (GNNs) that employs the Kuramoto model to mitigate the over-smoothing phenomenon, in which node features in GNNs become indistinguishable as the number of layers increases.
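The Kuramoto model the paper adapts describes coupled phase oscillators, $\dot\theta_i = \omega_i + \frac{K}{N}\sum_j \sin(\theta_j - \theta_i)$. The sketch below integrates it with Euler steps on an all-to-all graph (the frequencies, coupling, and step size are illustrative); full synchronization is the oscillator analogue of over-smoothing, and KuramotoGNN's premise is that the coupling can be set to avoid it.

```python
import math

# Kuramoto model of N coupled oscillators:
# d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
def kuramoto_step(theta, omega, K, dt=0.01):
    n = len(theta)
    dtheta = [
        omega[i] + (K / n) * sum(math.sin(theta[j] - theta[i]) for j in range(n))
        for i in range(n)
    ]
    return [t + dt * d for t, d in zip(theta, dtheta)]

# Identical natural frequencies and strong coupling: phases synchronize,
# i.e. all oscillators collapse to a common phase (the "over-smoothed" regime).
theta = [0.0, 1.0, 2.0]
omega = [0.0, 0.0, 0.0]
for _ in range(2000):
    theta = kuramoto_step(theta, omega, K=2.0, dt=0.01)
spread = max(theta) - min(theta)
```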
no code implementations • 11 Jun 2023 • Son Nguyen, Cuong Tran Manh, Kien T. Tran, Tan M. Nguyen, Thu-Trang Nguyen, Kien-Tuan Ngo, Hieu Dinh Vo
To implement this idea in the recommendation process, ARIST combines program analysis (PA), language models (LMs), and several features specialized for the recommendation task, which consider the functionality of formal parameters and the positional information of code elements (e.g., variables or method calls) in the given context.
1 code implementation • 16 Oct 2021 • Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher
Inspired by this observation, we propose Transformer with a Mixture of Gaussian Keys (Transformer-MGK), a novel transformer architecture that replaces redundant heads in transformers with a mixture of keys at each head.
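One way to read "a mixture of keys at each head" is probabilistic: the attention score of a query against a position is its log-likelihood under a Gaussian mixture centered at that position's keys. The sketch below follows that reading with hypothetical inputs; it is not the authors' implementation.

```python
import math

# Attention weights where each position holds a *mixture* of Gaussian keys:
# score(q, position j) = log of the mean Gaussian likelihood of q under
# the keys at position j, followed by a softmax over positions.
def mgk_attention_weights(q, key_mixtures, sigma=1.0):
    scores = []
    for keys in key_mixtures:  # keys: list of mixture centers for one position
        likelihood = sum(
            math.exp(-sum((qi - ki) ** 2 for qi, ki in zip(q, k))
                     / (2 * sigma ** 2))
            for k in keys
        ) / len(keys)
        scores.append(math.log(likelihood + 1e-12))
    m = max(scores)                            # stabilized softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

q = [1.0, 0.0]
key_mixtures = [[[1.0, 0.0], [0.9, 0.1]],      # position 0: keys near q
                [[-1.0, 0.0], [-0.9, -0.1]]]   # position 1: keys far from q
w = mgk_attention_weights(q, key_mixtures)     # position 0 gets more weight
```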
1 code implementation • NeurIPS 2021 • Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference.
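The continuous limit referred to is the heavy ball ODE $\ddot{x} + \gamma\dot{x} = -\nabla f(x)$, a damped second-order system whose discretization is momentum gradient descent. The sketch below integrates it with explicit Euler on the toy objective $f(x) = \tfrac{1}{2}x^2$ (an illustrative choice, not from the paper).

```python
# Heavy ball ODE: x'' + gamma * x' = -grad f(x), written as the
# first-order system x' = v, v' = -gamma * v - grad f(x).
def heavy_ball_trajectory(x0, gamma=1.0, dt=0.01, steps=2000):
    x, v = x0, 0.0               # position and velocity (the momentum state)
    for _ in range(steps):
        grad = x                 # gradient of f(x) = 0.5 * x^2
        v += dt * (-gamma * v - grad)
        x += dt * v
    return x

x_final = heavy_ball_trajectory(5.0)  # damped toward the minimum at 0
```

HBNODEs carry this velocity state alongside the feature state, which is what speeds up both the forward ODE solve and the adjoint backward pass.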
no code implementations • NeurIPS 2021 • Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
2 code implementations • NeurIPS 2020 • Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
Designing deep neural networks is an art that often involves an expensive search over candidate architectures.
1 code implementation • 24 Feb 2020 • Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst.
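For reference, the NAG iteration being discussed takes a gradient step at a lookahead point and then extrapolates with the $t_k$ momentum schedule. The sketch below runs it on the toy objective $f(x) = \tfrac{1}{2}x^2$; it shows plain NAG only, not the paper's scheduled-restart variant, and the step size and objective are illustrative.

```python
# Nesterov accelerated gradient with the standard t_k momentum schedule.
def nag(grad, x0, lr=0.5, steps=100):
    x, y = x0, x0          # y is the lookahead point
    t = 1.0
    for _ in range(steps):
        x_next = y - lr * grad(y)                        # step at lookahead
        t_next = (1 + (1 + 4 * t * t) ** 0.5) / 2        # momentum schedule
        y = x_next + ((t - 1) / t_next) * (x_next - x)   # extrapolation
        x, t = x_next, t_next
    return x

# Minimize f(x) = 0.5 * x^2, whose gradient is x (lr = 0.5 <= 1/L here).
x_final = nag(lambda x: x, x0=10.0)
```

With exact gradients the extrapolation accelerates convergence; with noisy gradients that same extrapolation is what accumulates error, motivating restarts.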
no code implementations • 9 Dec 2019 • Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar
Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.
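The exact likelihood comes from the instantaneous change-of-variables formula: alongside $\dot{z} = f(z, t)$, the log-density evolves as $\frac{d \log p}{dt} = -\mathrm{tr}\!\left(\frac{\partial f}{\partial z}\right)$. The sketch below checks this on an illustrative 1-D linear flow $f(z) = az$, where the trace is just $a$ and the log-density change over time $T$ is exactly $-aT$.

```python
# CNF log-density bookkeeping for the 1-D linear flow f(z) = a*z:
# integrate dz/dt = a*z and d(log p)/dt = -trace(df/dz) = -a jointly.
def cnf_logdet(a, T, dt=1e-3):
    z, delta_logp = 1.0, 0.0
    t = 0.0
    while t < T:
        z += dt * a * z          # Euler step for the state
        delta_logp -= dt * a     # accumulate -trace(df/dz)
        t += dt
    return delta_logp

val = cnf_logdet(a=0.5, T=2.0)   # analytic answer: -a*T = -1.0
```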