no code implementations • 13 Mar 2024 • Hong Hu, Yue M. Lu, Theodor Misiakiewicz
On the other hand, if $p = o(n)$, the number of random features $p$ is the limiting factor and RFRR test error matches the approximation error of the random feature model class (akin to taking $n = \infty$).
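The trade-off between the sample size $n$ and the number of random features $p$ can be explored numerically with a minimal random feature ridge regression (RFRR) sketch. The ReLU feature map, the $1/\sqrt{d}$ scaling, and the ridge level below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def rfrr_fit_predict(X_train, y_train, X_test, p, lam=1e-3, seed=0):
    """Random feature ridge regression (RFRR), minimal sketch:
    project inputs through p random ReLU features, then solve
    ridge regression in feature space."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    W = rng.standard_normal((d, p)) / np.sqrt(d)   # random first-layer weights
    phi = lambda X: np.maximum(X @ W, 0.0)         # ReLU random features
    F = phi(X_train)                               # n x p feature matrix
    # Ridge solution: theta = (F^T F + lam I)^{-1} F^T y
    theta = np.linalg.solve(F.T @ F + lam * np.eye(p), F.T @ y_train)
    return phi(X_test) @ theta

# Tiny usage example on a noiseless linear target
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ np.array([1.0, -1.0, 0.5, 0.0, 2.0])
pred = rfrr_fit_predict(X, y, X, p=500)
mse = np.mean((pred - y) ** 2)
```

Varying `p` relative to `n` in this sketch is one way to see the two regimes: for small `p` the feature count limits accuracy, while for large `p` the sample size does.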
no code implementations • 13 Feb 2024 • Burak Çakmak, Yue M. Lu, Manfred Opper
Motivated by the recent application of approximate message passing (AMP) to the analysis of convex optimization in multi-class classification [Loureiro et al.]
1 code implementation • 7 Feb 2024 • Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro
To our knowledge, our results provide the first tight description of the impact of feature learning on the generalization of two-layer neural networks in the large learning rate regime $\eta=\Theta_{d}(d)$, beyond perturbative finite-width corrections of the conjugate and neural tangent kernels.
no code implementations • 27 Oct 2023 • Sofiia Dubova, Yue M. Lu, Benjamin McKenna, Horng-Tzer Yau
Earlier work by various authors showed that, when the columns of $X$ are either uniform on the sphere or standard Gaussian vectors, and when $\ell$ is an integer (the linear regime $\ell = 1$ is particularly well-studied), the bulk eigenvalues of such matrices behave in a simple way: They are asymptotically given by the free convolution of the semicircular and Marčenko-Pastur distributions, with relative weights given by expanding $f$ in the Hermite basis.
no code implementations • 30 May 2022 • Lechao Xiao, Hong Hu, Theodor Misiakiewicz, Yue M. Lu, Jeffrey Pennington
As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes.
no code implementations • 13 May 2022 • Hong Hu, Yue M. Lu
The generalization performance of kernel ridge regression (KRR) exhibits a multi-phased pattern that crucially depends on the scaling relationship between the sample size $n$ and the underlying dimension $d$.
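For readers who want to experiment with this scaling numerically, here is a minimal kernel ridge regression sketch; the RBF kernel and the regularization level are illustrative choices:

```python
import numpy as np

def krr_fit_predict(X_train, y_train, X_test, lam=1e-2, gamma=1.0):
    """Kernel ridge regression (KRR), minimal sketch:
    alpha = (K + n*lam*I)^{-1} y, prediction f(x) = sum_i alpha_i k(x, x_i)."""
    def rbf(A, B):
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * sq)
    n = X_train.shape[0]
    K = rbf(X_train, X_train)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_train)
    return rbf(X_test, X_train) @ alpha

# Usage: fit a smooth 1D target
X = np.linspace(-1, 1, 50)[:, None]
y = np.sin(3 * X[:, 0])
pred = krr_fit_predict(X, y, X, lam=1e-4, gamma=10.0)
train_mse = np.mean((pred - y) ** 2)
```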
no code implementations • 12 May 2022 • Yue M. Lu, Horng-Tzer Yau
Our work reveals an equivalence principle: the spectrum of the random kernel matrix is asymptotically equivalent to that of a simpler matrix model, constructed as a linear combination of a (shifted) Wishart matrix and an independent matrix sampled from the Gaussian orthogonal ensemble.
no code implementations • 16 Feb 2022 • Burak Çakmak, Yue M. Lu, Manfred Opper
We analyze the dynamics of a random sequential message passing algorithm for approximate inference with large Gaussian latent variable models in a student-teacher scenario.
no code implementations • 15 Feb 2021 • Oussama Dhifallah, Yue M. Lu
Randomly perturbing networks during the training process is a commonly used approach to improving generalization performance.
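A minimal sketch of this idea, assuming Gaussian perturbations of the weights and a least-squares loss (both are illustrative choices; the paper treats a broader class of perturbations and models):

```python
import numpy as np

def sgd_with_weight_noise(X, y, epochs=200, lr=0.1, sigma=0.05, seed=0):
    """Gradient descent on least squares, perturbing the weights with
    Gaussian noise at every step before computing the gradient -- one
    simple instance of randomly perturbing a network during training."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_noisy = w + sigma * rng.standard_normal(d)   # perturbed weights
        grad = X.T @ (X @ w_noisy - y) / n             # gradient at perturbed point
        w -= lr * grad
    return w

# Usage: recover a planted weight vector despite the training noise
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
w_true = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
y = X @ w_true
w_hat = sgd_with_weight_noise(X, y)
```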
1 code implementation • 19 Jan 2021 • Yue M. Lu
This paper proposes a new algorithm, named Householder Dice (HD), for simulating dynamics on dense random matrix ensembles with translation-invariant properties.
no code implementations • 6 Jan 2021 • Oussama Dhifallah, Yue M. Lu
Transfer learning seeks to improve the generalization performance of a target task by exploiting the knowledge learned from a related source task.
1 code implementation • 8 Dec 2020 • Antoine Maillard, Florent Krzakala, Yue M. Lu, Lenka Zdeborová
We consider the phase retrieval problem, in which the observer wishes to recover an $n$-dimensional real or complex signal $\mathbf{X}^\star$ from the (possibly noisy) observation of $|\mathbf{\Phi} \mathbf{X}^\star|$, where $\mathbf{\Phi}$ is a matrix of size $m \times n$.
Information Theory • Disordered Systems and Neural Networks
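The measurement model above can be written down in a few lines; the i.i.d. Gaussian sensing matrix and the $1/\sqrt{n}$ scaling below are illustrative assumptions:

```python
import numpy as np

def phase_retrieval_observations(x_star, m, noise_std=0.0, seed=0):
    """Generate phaseless observations y = |Phi x*| (plus optional
    additive noise) for a random Gaussian sensing matrix Phi --
    a sketch of the phase retrieval measurement model."""
    rng = np.random.default_rng(seed)
    n = x_star.shape[0]
    Phi = rng.standard_normal((m, n)) / np.sqrt(n)
    y = np.abs(Phi @ x_star) + noise_std * rng.standard_normal(m)
    return Phi, y

# Usage: m = 4n noiseless magnitude measurements of a unit-norm signal
x = np.ones(16) / 4.0
Phi, y = phase_retrieval_observations(x, m=64)
```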
no code implementations • NeurIPS 2020 • Benjamin Aubin, Florent Krzakala, Yue M. Lu, Lenka Zdeborová
We consider a commonly studied supervised classification task on a synthetic dataset whose labels are generated by feeding i.i.d. random inputs to a one-layer neural network.
no code implementations • ICML 2020 • Francesca Mignacco, Florent Krzakala, Yue M. Lu, Lenka Zdeborová
We also illustrate the interpolation peak at low regularization, and analyze the role of the respective sizes of the two clusters.
no code implementations • 13 May 2019 • Luca Saglietti, Yue M. Lu, Carlo Lucibello
In Generalized Linear Estimation (GLE) problems, we seek to estimate a signal that is observed through a linear transform followed by a component-wise, possibly nonlinear and noisy, channel.
1 code implementation • 27 Mar 2019 • Hong Hu, Yue M. Lu
In sparse linear regression, the SLOPE estimator generalizes LASSO by penalizing different coordinates of the estimate according to their magnitudes.
Information Theory • Statistics Theory
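The sorted-$\ell_1$ penalty that defines SLOPE is easy to state in code. A minimal sketch (`slope_penalty` is an illustrative helper name, not from the paper):

```python
import numpy as np

def slope_penalty(beta, lambdas):
    """SLOPE (sorted-L1) penalty: sum_i lambda_i * |beta|_(i), where
    |beta|_(1) >= |beta|_(2) >= ... and lambda_1 >= lambda_2 >= ... .
    With all lambdas equal, this reduces to the LASSO penalty."""
    mags = np.sort(np.abs(beta))[::-1]        # magnitudes, descending
    lam = np.sort(np.asarray(lambdas, dtype=float))[::-1]  # weights, descending
    return float(mags @ lam)

# Usage: largest coefficient gets the largest weight
val = slope_penalty(np.array([3.0, -1.0, 2.0]), [0.3, 0.2, 0.1])   # 0.3*3 + 0.2*2 + 0.1*1
lasso_val = slope_penalty(np.array([3.0, -1.0, 2.0]), [0.2, 0.2, 0.2])  # 0.2 * (3+1+2)
```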
no code implementations • 25 Sep 2018 • Yuejie Chi, Yue M. Lu, Yuxin Chen
Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization.
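A minimal sketch of the nonconvex-factorization approach surveyed here: gradient descent on the factored objective $f(U, V) = \tfrac{1}{2}\|UV^\top - M\|_F^2$. The step size and small random initialization are illustrative choices:

```python
import numpy as np

def factored_gradient_descent(M, r, steps=3000, lr=0.1, seed=0):
    """Gradient descent on the nonconvex factored objective
    f(U, V) = 0.5 * ||U V^T - M||_F^2 for rank-r recovery."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(steps):
        R = U @ V.T - M                                  # residual
        U, V = U - lr * R @ V, V - lr * R.T @ U          # simultaneous update
    return U, V

# Usage: recover a rank-2 matrix with unit-scale factors
rng = np.random.default_rng(1)
a, b = rng.standard_normal(8), rng.standard_normal(8)
c, d = rng.standard_normal(8), rng.standard_normal(8)
M = (np.outer(a / np.linalg.norm(a), b / np.linalg.norm(b))
     + np.outer(c / np.linalg.norm(c), d / np.linalg.norm(d)))
U, V = factored_gradient_descent(M, r=2)
rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```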
no code implementations • 26 Jun 2018 • Dror Simon, Jeremias Sulam, Yaniv Romano, Yue M. Lu, Michael Elad
The proposed method adds controlled noise to the input and estimates a sparse representation from the perturbed signal.
no code implementations • 12 Jun 2018 • Laura Balzano, Yuejie Chi, Yue M. Lu
This survey article reviews a variety of classical and recent algorithms for solving this problem with low computational and memory complexities, particularly those applicable in the big data regime with missing data.
no code implementations • NeurIPS 2019 • Chuang Wang, Hong Hu, Yue M. Lu
We present a theoretical analysis of the training process for a single-layer GAN trained on high-dimensional input data.
no code implementations • 17 May 2018 • Chuang Wang, Yonina C. Eldar, Yue M. Lu
In addition to providing asymptotically exact predictions of the dynamic performance of the algorithms, our high-dimensional analysis yields several insights, including an asymptotic equivalence between Oja's method and GROUSE, and a precise scaling relationship linking the amount of missing data to the signal-to-noise ratio.
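Oja's method, one of the two algorithms in the comparison, can be sketched in a few lines; the constant step size and the spiked-covariance data stream below are illustrative choices:

```python
import numpy as np

def oja_streaming_pca(stream, d, lr=0.01, seed=0):
    """Oja's method for streaming PCA: rank-one stochastic update
    w <- w + lr * x (x^T w), followed by renormalization."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for x in stream:
        w = w + lr * x * (x @ w)
        w /= np.linalg.norm(w)
    return w

# Usage: spiked-covariance stream x = sqrt(snr) * g * u + Gaussian noise
rng = np.random.default_rng(1)
d, snr, T = 20, 4.0, 5000
u = np.zeros(d); u[0] = 1.0
stream = (np.sqrt(snr) * rng.standard_normal(T)[:, None] * u[None, :]
          + rng.standard_normal((T, d)))
w_hat = oja_streaming_pca(stream, d)
alignment = abs(w_hat @ u)
```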
no code implementations • 8 Dec 2017 • Chuang Wang, Jonathan Mattingly, Yue M. Lu
In addition to characterizing the dynamic performance of online learning algorithms, our asymptotic analysis also provides useful insights.
no code implementations • NeurIPS 2017 • Chuang Wang, Yue M. Lu
As the ambient dimension tends to infinity, and with proper time scaling, we show that the time-varying joint empirical measure of the target feature vector and the estimates provided by the algorithm will converge weakly to a deterministic measure-valued process that can be characterized as the unique solution of a nonlinear PDE.
no code implementations • 21 Feb 2017 • Yue M. Lu, Gen Li
We study a spectral initialization method that serves a key role in recent work on estimating signals in nonconvex settings.
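The spectral initialization recipe is short: form the weighted data matrix $D = \frac{1}{m}\sum_i y_i a_i a_i^\top$ and take its leading eigenvector. A minimal sketch, using phaseless quadratic measurements as an illustrative example:

```python
import numpy as np

def spectral_init(A, y):
    """Spectral initialization: leading eigenvector of
    D = (1/m) * sum_i y_i a_i a_i^T, used to initialize
    nonconvex estimation procedures."""
    m = A.shape[0]
    D = (A * y[:, None]).T @ A / m        # symmetric by construction
    eigvals, eigvecs = np.linalg.eigh(D)  # eigenvalues in ascending order
    return eigvecs[:, -1]                 # eigenvector of the largest eigenvalue

# Usage: phaseless measurements y_i = (a_i^T x)^2 of a unit signal
rng = np.random.default_rng(1)
m, n = 3000, 20
x = np.zeros(n); x[0] = 1.0
A = rng.standard_normal((m, n))
y = (A @ x) ** 2
x_hat = spectral_init(A, y)
alignment = abs(x_hat @ x)
```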
no code implementations • 4 Jun 2016 • Rujie Yin, Tingran Gao, Yue M. Lu, Ingrid Daubechies
We propose an image representation scheme combining the local and nonlocal characterization of patches in an image.
no code implementations • 1 Jan 2016 • Stanley H. Chan, Todd Zickler, Yue M. Lu
We show that Sinkhorn-Knopp is equivalent to an Expectation-Maximization (EM) algorithm for learning a Gaussian mixture model of the image patches.
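The Sinkhorn-Knopp iteration itself is simple: alternately normalize the rows and columns of a positive affinity matrix until it is (approximately) doubly stochastic. A minimal sketch:

```python
import numpy as np

def sinkhorn_knopp(W, iters=200):
    """Sinkhorn-Knopp: alternately normalize rows and columns of a
    positive matrix until it is approximately doubly stochastic."""
    A = np.array(W, dtype=float)
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)  # row normalization
        A /= A.sum(axis=0, keepdims=True)  # column normalization
    return A

# Usage: a random strictly positive affinity matrix
rng = np.random.default_rng(0)
W = rng.uniform(0.1, 1.0, size=(5, 5))
P = sinkhorn_knopp(W)
```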
no code implementations • 27 Dec 2013 • Stanley H. Chan, Todd Zickler, Yue M. Lu
In particular, our error probability bounds show that, at any given sampling ratio, the probability for MCNLM to have a large deviation from the original NLM solution decays exponentially as the size of the image or database grows.