no code implementations • ECCV 2020 • Zhuoning Yuan, Zhishuai Guo, Xiaotian Yu, Xiaoyu Wang, Tianbao Yang
In our experiment, we demonstrate that the proposed framework is able to train deep learning models with millions of classes and achieve a more than 10× speedup compared to existing approaches.
1 code implementation • 30 May 2023 • Quanqi Hu, Zi-Hao Qiu, Zhishuai Guo, Lijun Zhang, Tianbao Yang
In this paper, we consider non-convex multi-block bilevel optimization (MBBO) problems, which involve $m\gg 1$ lower level problems and have important applications in machine learning.
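As a rough sketch of this problem class (the exact formulation in the paper may carry additional structure or constraints), an MBBO problem couples a single upper-level objective to $m$ lower-level problems:

$$\min_{x} \; \frac{1}{m}\sum_{i=1}^{m} f_i\big(x, y_i^*(x)\big), \qquad y_i^*(x) = \arg\min_{y_i} g_i(x, y_i),$$

so that each lower-level solution $y_i^*(x)$ has to be tracked, at least approximately, before the upper-level gradient can be estimated.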
1 code implementation • 26 Oct 2022 • Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang
To this end, we propose an active-passive decomposition framework that decouples the gradient into two types of components: active parts, which depend on local data and are computed with the local model, and passive parts, which depend on other machines and are communicated/computed based on historical models and samples.
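A minimal sketch of the idea in Python (the function and variable names here are hypothetical, not the paper's API): the per-round gradient is assembled from a freshly computed active part and cached passive parts that were communicated in earlier rounds.

```python
import numpy as np

def decomposed_gradient(local_grad_fn, local_model, local_batch, cached_passive_parts):
    # Active part: depends only on local data, computed with the current local model.
    active = local_grad_fn(local_model, local_batch)
    # Passive parts: contributions tied to other machines, reused from
    # historical models/samples rather than recomputed from fresh remote data.
    passive = np.mean(cached_passive_parts, axis=0)
    return active + passive
```

The intent of the split, as described above, is that only the active part requires fresh local computation at every step, while the passive parts are carried over from historical communication.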
no code implementations • 7 Dec 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Although rigorous convergence analyses exist for Adam, they impose specific requirements on the update of the adaptive step size that are not generic enough to cover many other variants of Adam.
no code implementations • ICLR 2022 • Zhuoning Yuan, Zhishuai Guo, Nitesh Chawla, Tianbao Yang
The key idea of compositional training is to minimize a compositional objective function, where the outer function corresponds to an AUC loss and the inner function represents a gradient descent step for minimizing a traditional loss, e.g., the cross-entropy (CE) loss.
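A minimal sketch of such a compositional objective, using a linear scorer and a pairwise squared-hinge AUC surrogate (both are illustrative assumptions, not necessarily the paper's exact losses):

```python
import numpy as np

def ce_loss_grad(w, X, y):
    """Gradient of the binary cross-entropy loss for a linear scorer."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def auc_surrogate(w, X, y, margin=1.0):
    """Pairwise squared-hinge surrogate of the AUC loss."""
    scores = X @ w
    pos, neg = scores[y == 1], scores[y == 0]
    gaps = margin - (pos[:, None] - neg[None, :])
    return np.mean(np.maximum(gaps, 0.0) ** 2)

def compositional_objective(w, X, y, inner_lr=0.1):
    # Inner function: one gradient descent step on the traditional (CE) loss.
    w_inner = w - inner_lr * ce_loss_grad(w, X, y)
    # Outer function: the AUC loss evaluated at the updated model.
    return auc_surrogate(w_inner, X, y)
```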
no code implementations • 5 May 2021 • Zhishuai Guo, Quanqi Hu, Lijun Zhang, Tianbao Yang
Although numerous studies have proposed stochastic algorithms for solving these problems, they are limited in two respects: (i) their sample complexities are high and do not match the state-of-the-art result for non-convex stochastic optimization; (ii) their algorithms are tailored to problems with only one lower-level problem.
no code implementations • 30 Apr 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Our analysis shows that an increasing or large enough "momentum" parameter for the first-order moment used in practice is sufficient to ensure that Adam and its many variants converge under a mild boundedness condition on the adaptive scaling factor of the step size.
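To make the condition concrete, here is a sketch of one Adam-style step in which the first-moment parameter grows toward 1 over iterations; the specific schedule is a hypothetical choice for illustration, not the one prescribed by the analysis.

```python
import numpy as np

def adam_style_step(w, grad, m, v, t, lr=1e-3, beta2=0.999, eps=1e-8):
    beta1_t = 1.0 - 1.0 / (t + 2)             # increasing "momentum" parameter
    m = beta1_t * m + (1.0 - beta1_t) * grad  # first-order moment
    v = beta2 * v + (1.0 - beta2) * grad**2   # second moment
    w = w - lr * m / (np.sqrt(v) + eps)       # adaptive scaling of the step size
    return w, m, v
```

The boundedness condition referred to above concerns the adaptive scaling factor of the step size (here, `np.sqrt(v) + eps`).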
1 code implementation • 9 Feb 2021 • Zhuoning Yuan, Zhishuai Guo, Yi Xu, Yiming Ying, Tianbao Yang
Deep AUC (area under the ROC curve) Maximization (DAM) has attracted much attention recently due to its great potential for imbalanced data classification.
1 code implementation • NeurIPS 2021 • Qi Qi, Zhishuai Guo, Yi Xu, Rong Jin, Tianbao Yang
In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) with non-convex objectives, which has important applications in machine learning for improving the robustness of neural networks.
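One common instance of such a DRO objective is the KL-regularized (soft-max) form sketched below; this is consistent with the description above but not necessarily the paper's exact formulation, and `tau` is an illustrative temperature parameter.

```python
import numpy as np

def kl_dro_objective(losses, tau=1.0):
    # tau * log( mean( exp(loss_i / tau) ) ): up-weights hard examples,
    # and recovers the plain average loss as tau grows large.
    losses = np.asarray(losses, dtype=float)
    return tau * np.log(np.mean(np.exp(losses / tau)))
```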
no code implementations • 12 Jun 2020 • Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang
However, most of the existing algorithms are slow in practice, and their analysis revolves around convergence to a nearly stationary point. We consider leveraging the Polyak-Łojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantees.
1 code implementation • ICML 2020 • Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang
In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.
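For context, AUC maximization with a squared surrogate is often rewritten as a min-max problem (in the style of Ying et al.), which is what makes stochastic primal-dual and distributed updates applicable; the sketch below uses illustrative variable names and 0/1 labels, and may differ in detail from the paper's formulation.

```python
import numpy as np

def auc_minmax_objective(w, a, b, alpha, X, y, p):
    """Primal variables w, a, b; dual variable alpha (maximized); p = P(y = 1)."""
    scores = X @ w
    pos, neg = scores[y == 1], scores[y == 0]
    return ((1 - p) * np.mean((pos - a) ** 2)
            + p * np.mean((neg - b) ** 2)
            + 2 * (1 + alpha) * (p * np.mean(neg) - (1 - p) * np.mean(pos))
            - p * (1 - p) * alpha ** 2)
```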
no code implementations • 9 Mar 2020 • Zhishuai Guo, Yan Yan, Tianbao Yang
It remains unclear how these averaging schemes affect the convergence of both the optimization error and the generalization error (two equally important components of the testing error) for non-strongly convex objectives, including non-convex problems.