no code implementations • 17 Jun 2023 • Zhenxun Zhuang
An algorithm is said to be adaptive to a certain parameter of the problem if it does not need a priori knowledge of that parameter yet performs competitively with algorithms that do know it.
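To make the notion concrete, here is a hedged illustration not taken from this abstract (constants and logarithmic factors are suppressed): for an $L$-smooth convex objective with stochastic gradients of variance $\sigma^2$, an algorithm adaptive to the noise level $\sigma$ attains, without being told $\sigma$, a rate comparable to the optimally tuned one,

$$\mathbb{E}\bigl[f(\bar{x}_T) - f^\star\bigr] \;\le\; O\!\left(\frac{L}{T} + \frac{\sigma}{\sqrt{T}}\right),$$

so in the noiseless case ($\sigma = 0$) it automatically recovers the faster deterministic rate.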
no code implementations • 23 Aug 2022 • Michael Crawshaw, Mingrui Liu, Francesco Orabona, Wei Zhang, Zhenxun Zhuang
We also compare these algorithms with popular optimizers on a set of deep learning tasks, observing that we can match the performance of Adam while beating the others.
1 code implementation • 10 May 2022 • Mingrui Liu, Zhenxun Zhuang, Yunwen Lei, Chunyang Liao
Gradient clipping is usually employed to address this issue in the single-machine setting, but exploring this technique in the distributed setting is still in its infancy: it remains unclear whether the gradient clipping scheme can take advantage of multiple machines to achieve parallel speedup.
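For reference, single-machine gradient clipping rescales a stochastic gradient whose norm exceeds a threshold before the update is applied. The sketch below is illustrative only (the function names, learning rate, and clipping threshold are made up), not the paper's distributed algorithm.

```python
# Minimal sketch of gradient clipping for one SGD step (single machine).
import numpy as np

def clip_by_norm(grad, max_norm):
    """Scale the gradient so its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

def clipped_sgd_step(w, grad, lr=0.1, max_norm=1.0):
    """One SGD update using the clipped stochastic gradient."""
    return w - lr * clip_by_norm(grad, max_norm)

# Example: a large noisy gradient is rescaled before the update.
w = np.zeros(3)
g = np.array([10.0, -8.0, 6.0])   # unclipped norm is about 14.1
w = clipped_sgd_step(w, g, lr=0.1, max_norm=1.0)
```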
1 code implementation • 31 Jan 2022 • Zhenxun Zhuang, Mingrui Liu, Ashok Cutkosky, Francesco Orabona
First, we show how to re-interpret AdamW as an approximation of a proximal gradient method, which takes advantage of the closed-form proximal mapping of the regularizer instead of only utilizing its gradient information as in Adam-$\ell_2$.
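A minimal sketch of the contrast described here, assuming generic Adam hyperparameters (the variable names and constants are not the paper's): Adam-$\ell_2$ folds the decay term into the gradient before the moment estimates, while AdamW applies the decay directly to the weights after the Adam step, in the spirit of a proximal update on $\frac{\lambda}{2}\|w\|^2$.

```python
# Illustrative comparison of Adam-l2 vs. AdamW-style decoupled weight decay.
import numpy as np

def adam_update(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
                wd=0.01, decoupled=False):
    if not decoupled:
        g = g + wd * w                       # Adam-l2: decay enters the gradient
    m = b1 * m + (1 - b1) * g                # first-moment estimate
    v = b2 * v + (1 - b2) * g * g            # second-moment estimate
    m_hat = m / (1 - b1 ** t)                # bias corrections
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        w = w - lr * wd * w                  # AdamW: decay applied to w directly,
                                             # like a proximal step on (wd/2)*||w||^2
    return w, m, v
```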
2 code implementations • 12 Feb 2020 • Xiaoyu Li, Zhenxun Zhuang, Francesco Orabona
Moreover, we show the surprising property that these two strategies are \emph{adaptive} to the noise level in the stochastic gradients of PL functions.
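For context, a function $f$ with minimum value $f^\star$ satisfies the Polyak-Łojasiewicz (PL) condition with parameter $\mu > 0$ if

$$\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^\star\bigr) \quad \text{for all } x,$$

which guarantees that every stationary point is a global minimizer without requiring convexity.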
no code implementations • 22 Oct 2019 • Zhenxun Zhuang, Yunlong Wang, Kezi Yu, Songtao Lu
The online meta-learning framework is designed for the continual lifelong learning setting.
1 code implementation • 25 Jan 2019 • Zhenxun Zhuang, Ashok Cutkosky, Francesco Orabona
Stochastic Gradient Descent (SGD) has played a central role in machine learning.