no code implementations • ICML 2020 • Samy Jelassi, Carles Domingo-Enrich, Damien Scieur, Arthur Mensch, Joan Bruna
Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g., when training GANs.
1 code implementation • 4 Dec 2023 • Carles Domingo-Enrich, Jiequn Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen
Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffusion models.
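As a rough illustration (not the paper's algorithm), the IDO template can be sketched as a least-squares regression of a parametric control onto a target field along simulated trajectories. The `matching_target` below is a hypothetical placeholder; SOCM derives its actual target through a path-wise reparametrization argument.

```python
# Hedged sketch of the IDO template SOCM instantiates: regress a parametric
# control u_theta onto a target field along simulated trajectories with a
# least-squares (matching) loss. `matching_target` is a placeholder, NOT
# SOCM's matching vector field.
import torch

dim, n_steps, batch = 2, 50, 128
u = torch.nn.Sequential(torch.nn.Linear(dim + 1, 64), torch.nn.SiLU(),
                        torch.nn.Linear(64, dim))
opt = torch.optim.Adam(u.parameters(), lr=1e-3)
dt = 1.0 / n_steps

def matching_target(t, x):      # placeholder target (assumption)
    return -x                   # e.g. steer the state toward the origin

for _ in range(200):
    x = torch.randn(batch, dim)                  # initial states
    loss = 0.0
    for k in range(n_steps):                     # Euler-Maruyama trajectory
        t = torch.full((batch, 1), k * dt)
        pred = u(torch.cat([t, x], dim=-1))
        loss = loss + ((pred - matching_target(t, x)) ** 2).sum(-1).mean() * dt
        x = x + dt ** 0.5 * torch.randn_like(x)  # uncontrolled base dynamics
    opt.zero_grad(); loss.backward(); opt.step()
```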
no code implementations • 27 Jun 2023 • Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton
We find that relative position embeddings enable length generalization for simple tasks, such as addition: models trained on $5$-digit numbers can perform $15$-digit sums.
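For illustration, a minimal sketch of one common form of relative position embedding: an additive attention bias that depends only on the offset $j - i$, so the same learned table applies at any sequence length (the paper evaluates several variants; this specific form is an assumption).

```python
# Minimal additive relative-position bias in self-attention. The bias depends
# only on the (clipped) offset j - i, so the learned table transfers to
# sequences longer than those seen in training.
import torch

def attention_with_relative_bias(q, k, v, rel_bias, max_dist=16):
    # q, k, v: (seq, d); rel_bias: learnable table of shape (2*max_dist+1,)
    seq, d = q.shape
    idx = torch.arange(seq)
    offsets = (idx[None, :] - idx[:, None]).clamp(-max_dist, max_dist) + max_dist
    scores = q @ k.T / d ** 0.5 + rel_bias[offsets]   # (seq, seq)
    return torch.softmax(scores, dim=-1) @ v

seq, d, max_dist = 20, 8, 16
q, k, v = (torch.randn(seq, d) for _ in range(3))
rel_bias = torch.nn.Parameter(torch.zeros(2 * max_dist + 1))
out = attention_with_relative_bias(q, k, v, rel_bias)  # works at any seq length
```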
no code implementations • 20 Jun 2023 • Vivien Cabannes, Carles Domingo-Enrich
The theory of statistical learning has focused on variational objectives expressed on functions.
Out-of-Distribution Generalization • Weakly-supervised Learning
no code implementations • 28 Apr 2023 • Aram-Alexandre Pooladian, Heli Ben-Hamu, Carles Domingo-Enrich, Brandon Amos, Yaron Lipman, Ricky T. Q. Chen
Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples.
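A minimal sketch of one such simulation-free objective: conditional flow matching with straight-line paths $x_t = (1-t)x_0 + t x_1$ between noise $x_0$ and data $x_1$, under an independent noise-data coupling (the choice of coupling is precisely what work in this area varies).

```python
# Conditional flow matching with straight-line probability paths: regress a
# velocity field v_theta(t, x) onto the path velocity x1 - x0, with no ODE/SDE
# simulation in the training loop. Toy data; independent coupling assumed.
import torch

model = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.SiLU(),
                            torch.nn.Linear(64, 2))   # v_theta(t, x) in R^2
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    x1 = torch.randn(256, 2) * 0.3 + 2.0   # stand-in "data" samples
    x0 = torch.randn(256, 2)               # noise samples (independent coupling)
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1             # point on the probability path
    target = x1 - x0                       # conditional velocity of the path
    pred = model(torch.cat([t, xt], dim=-1))
    loss = ((pred - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```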
no code implementations • 23 Feb 2023 • Carles Domingo-Enrich, Aram-Alexandre Pooladian
In this short note, we complement existing results in the literature by providing an explicit expansion of $\text{KL}(\rho_t^{\text{FR}}\|\pi)$ in terms of $e^{-t}$, where $(\rho_t^{\text{FR}})_{t\geq 0}$ is the Fisher-Rao (FR) gradient flow of the KL divergence.
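For context, a hedged restatement of the standard facts behind the note (paraphrased, not quoted from it): the FR gradient flow of $\rho \mapsto \text{KL}(\rho\|\pi)$ admits a closed-form geometric-mixture solution,

$$\partial_t \rho_t^{\text{FR}} = -\rho_t^{\text{FR}} \left( \log \frac{\rho_t^{\text{FR}}}{\pi} - \text{KL}(\rho_t^{\text{FR}}\|\pi) \right), \qquad \rho_t^{\text{FR}} \propto \pi^{\,1 - e^{-t}}\, \rho_0^{\,e^{-t}},$$

so $\log(\rho_t^{\text{FR}}/\pi)$ is linear in $e^{-t}$ up to a normalization constant, and quantities such as $\text{KL}(\rho_t^{\text{FR}}\|\pi)$ admit expansions in powers of $e^{-t}$.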
1 code implementation • 14 Jan 2023 • Carles Domingo-Enrich, Raaz Dwivedi, Lester Mackey
To address these shortcomings, we introduce Compress Then Test (CTT), a new framework for high-powered kernel testing based on sample compression.
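A toy sketch of the compress-then-test pattern, with plain random subsampling standing in for CTT's kernel-thinning compression (an intentional simplification): shrink both samples to small coresets, then run an MMD permutation test on the coresets only.

```python
# Compress-then-test in miniature: compress each sample to m points, then run
# a quadratic-time MMD permutation test on the coresets. Random subsampling
# stands in for CTT's kernel-thinning compression (assumption).
import numpy as np

def mmd2(x, y, bw=1.0):
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bw ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def compress_then_test(x, y, m=32, n_perm=200, seed=0):
    rng = np.random.default_rng(seed)
    cx = x[rng.choice(len(x), m, replace=False)]   # coreset stand-ins
    cy = y[rng.choice(len(y), m, replace=False)]
    stat, pooled = mmd2(cx, cy), np.vstack([cx, cy])
    null = []
    for _ in range(n_perm):                        # permutation null
        p = rng.permutation(2 * m)
        null.append(mmd2(pooled[p[:m]], pooled[p[m:]]))
    return (1 + sum(s >= stat for s in null)) / (1 + n_perm)   # p-value

rng = np.random.default_rng(1)
x, y = rng.normal(0, 1, (500, 2)), rng.normal(0.5, 1, (500, 2))
print(compress_then_test(x, y))    # small p-value: distributions differ
```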
1 code implementation • 1 Jun 2022 • Carles Domingo-Enrich
When solving finite-sum minimization problems, two common alternatives to stochastic gradient descent (SGD) with theoretical benefits are random reshuffling (SGD-RR) and shuffle-once (SGD-SO), in which functions are sampled in cycles without replacement.
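The three sampling schemes side by side, on a least-squares finite sum (a minimal sketch, not code from the paper):

```python
# SGD draws i.i.d. indices with replacement; SGD-RR reshuffles every epoch;
# SGD-SO draws one permutation and reuses it each epoch.
import numpy as np

def index_stream(n, n_epochs, scheme, rng):
    if scheme == "sgd":                       # with replacement
        for _ in range(n * n_epochs):
            yield rng.integers(n)
    elif scheme == "rr":                      # random reshuffling
        for _ in range(n_epochs):
            yield from rng.permutation(n)
    elif scheme == "so":                      # shuffle-once
        perm = rng.permutation(n)
        for _ in range(n_epochs):
            yield from perm

# f_i(w) = 0.5 * (a_i . w - b_i)^2, minimized with constant-step SGD
rng = np.random.default_rng(0)
n, d = 100, 5
A, b = rng.normal(size=(n, d)), rng.normal(size=n)
for scheme in ["sgd", "rr", "so"]:
    w, lr = np.zeros(d), 0.01
    for i in index_stream(n, 50, scheme, np.random.default_rng(1)):
        w -= lr * (A[i] @ w - b[i]) * A[i]
    print(scheme, 0.5 * np.mean((A @ w - b) ** 2))
```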
1 code implementation • 27 May 2022 • Carles Domingo-Enrich, Youssef Mroueh
Differential privacy (DP) is the de facto standard for private data release and private machine learning.
1 code implementation • 27 May 2022 • Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh
Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via minimizing integral probability metrics (IPMs).
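A miniature of the implicit route: fit a pushforward of Gaussian noise to data by descending an IPM, here the (biased) MMD estimator, whose witness class is the unit ball of an RKHS (the toy data and architecture are assumptions).

```python
# Implicit modeling via IPM minimization: train a generator by gradient
# descent on the MMD between generated and real samples.
import torch

def mmd2(x, y, bw=1.0):
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * bw ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

gen = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.SiLU(),
                          torch.nn.Linear(32, 2))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
data = torch.randn(256, 2) * 0.5 + torch.tensor([2.0, -1.0])  # toy target

for _ in range(500):
    fake = gen(torch.randn(256, 2))   # pushforward of Gaussian noise
    loss = mmd2(fake, data)           # the IPM being minimized
    opt.zero_grad(); loss.backward(); opt.step()
```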
no code implementations • 14 Feb 2022 • Carles Domingo-Enrich, Joan Bruna
Min-max optimization problems arise in several key machine learning setups, including adversarial learning and generative modeling.
no code implementations • 27 Dec 2021 • Carles Domingo-Enrich
We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while it decays exponentially in $d$ if $F$ is any two-layer network with polynomial weights.
no code implementations • ICLR 2022 • Carles Domingo-Enrich, Youssef Mroueh
A well-known line of work (Barron, 1993; Breiman, 1993; Klusowski & Barron, 2018) provides bounds on the width $n$ of a ReLU two-layer neural network needed to approximate a function $f$ over the ball $\mathcal{B}_R(\mathbb{R}^d)$ up to error $\epsilon$, when the Fourier based quantity $C_f = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \|\xi\|^2 |\hat{f}(\xi)| \ d\xi$ is finite.
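Schematically, and hedging on constants (the precise $R$- and $d$-dependent factors differ across the three cited works), these bounds take the form

$$\inf_{f_n \in \mathcal{F}_n} \; \sup_{x \in \mathcal{B}_R(\mathbb{R}^d)} |f(x) - f_n(x)| \;\lesssim\; \frac{C_f}{\sqrt{n}},$$

where $\mathcal{F}_n$ is the class of two-layer ReLU networks of width $n$; hence a width $n = O\big((C_f/\epsilon)^2\big)$ suffices to reach error $\epsilon$ whenever $C_f$ is finite.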
no code implementations • 11 Jul 2021 • Carles Domingo-Enrich, Alberto Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden
In the feature-learning regime, this dual formulation justifies a two time-scale gradient ascent-descent (GDA) training algorithm that concurrently updates the particles in the sample space and the neurons in the parameter space of the energy.
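A hedged sketch of such a loop (toy data, Langevin particle updates, and the step sizes are assumptions, not the paper's exact algorithm):

```python
# Two time-scale ascent-descent for an EBM: particles descend the current
# energy in sample space (fast clock, Langevin step), while the energy's
# parameters are updated on a contrastive objective (slow clock).
import torch

energy = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Softplus(),
                             torch.nn.Linear(64, 1))
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)   # slow time scale
data = torch.randn(512, 2) * 0.5 + 1.0                 # stand-in data
particles = torch.randn(512, 2)                        # model samples
eta = 0.1                                              # fast time scale

for _ in range(500):
    # particle update: noisy gradient descent on the current energy
    particles.requires_grad_(True)
    g = torch.autograd.grad(energy(particles).sum(), particles)[0]
    particles = (particles - eta * g
                 + (2 * eta) ** 0.5 * 0.1 * torch.randn_like(particles)).detach()
    # parameter update: lower energy on data, raise it on particles
    loss = energy(data).mean() - energy(particles).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```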
no code implementations • NeurIPS 2021 • Carles Domingo-Enrich, Youssef Mroueh
Several works in implicit and explicit generative modeling empirically observed that feature-learning discriminators outperform fixed-kernel discriminators in terms of the sample quality of the models.
1 code implementation • 15 Apr 2021 • Carles Domingo-Enrich, Alberto Bietti, Eric Vanden-Eijnden, Joan Bruna
Energy-based models (EBMs) are a simple yet powerful framework for generative modeling.
no code implementations • ICLR 2021 • Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur
First, we show that for zero-sum bilinear games the average-case optimal method is the optimal method for the minimization of the Hamiltonian.
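Concretely, for the zero-sum bilinear game $\min_x \max_y x^\top A y$, the simultaneous-gradient field and the Hamiltonian it induces are the standard

$$v(x, y) = \begin{pmatrix} A y \\ -A^\top x \end{pmatrix}, \qquad \mathcal{H}(x, y) = \tfrac{1}{2}\|v(x, y)\|^2 = \tfrac{1}{2}\left(\|A^\top x\|^2 + \|A y\|^2\right),$$

whose global minimizers, the zeros of $v$, are exactly the Nash equilibria of the game.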
no code implementations • NeurIPS 2020 • Carles Domingo-Enrich, Samy Jelassi, Arthur Mensch, Grant Rotskoff, Joan Bruna
Our method identifies mixed equilibria in high dimensions and is demonstrably effective for training mixtures of GANs.
1 code implementation • 29 May 2019 • Samy Jelassi, Carles Domingo-Enrich, Damien Scieur, Arthur Mensch, Joan Bruna
Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g., when training GANs.