Search Results for author: Luca Pesce

Found 6 papers, 4 papers with code

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions

no code implementations • 24 May 2024 • Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan

Here, we investigate the training dynamics of two-layer shallow neural networks trained with gradient-based algorithms, and discuss how they learn pertinent features in multi-index models, that is, target functions with low-dimensional relevant directions.

Computational Efficiency

Asymptotics of feature learning in two-layer networks after one gradient-step

1 code implementation • 7 Feb 2024 • Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

To our knowledge, our results provide the first tight description of the impact of feature learning in the generalization of two-layer neural networks in the large learning rate regime $\eta=\Theta_{d}(d)$, beyond perturbative finite width corrections of the conjugate and neural tangent kernels.

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents

no code implementations • 5 Feb 2024 • Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala

In particular, multi-pass GD with finite stepsize is found to overcome the limitations of gradient flow and single-pass GD given by the information exponent (Ben Arous et al., 2021) and leap exponent (Abbe et al., 2023) of the target function.

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time

1 code implementation • 29 May 2023 • Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

The picture drastically improves over multiple gradient steps: we show that a batch size of $n = \mathcal{O}(d)$ is indeed enough to learn multiple target directions satisfying a staircase property, where more and more directions can be learned over time.

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

1 code implementation • 17 Feb 2023 • Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we ask ourselves the question: "when is a single Gaussian enough to characterize the error?"

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

1 code implementation • 26 May 2022 • Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors.

Clustering, Vocal Bursts Intensity Prediction
