no code implementations • 23 Feb 2024 • Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
Our local complexity measures the density of the so-called 'linear regions' (aka spline partition regions) that tile the DNN input space, and serves as a useful progress measure for training.
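To make the notion concrete, here is a minimal Python sketch (an illustration, not the paper's implementation): it approximates local complexity by counting the distinct ReLU activation patterns, i.e., linear regions, hit by random samples in a ball around a point. The toy network, radius, and sampling scheme are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU MLP with random weights.
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 16)), rng.normal(size=16)

def activation_pattern(x):
    """Binary on/off code of every ReLU unit at input x; one code = one linear region."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

def local_complexity(center, radius=0.5, n_samples=5000):
    """Count the distinct linear regions hit by samples in a ball around `center`."""
    pts = center + radius * rng.uniform(-1, 1, size=(n_samples, 2))
    return len({activation_pattern(p) for p in pts})

print(local_complexity(np.zeros(2)))      # region density near the origin
print(local_complexity(np.ones(2) * 5))   # often lower far from where the hyperplanes concentrate
```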
no code implementations • 17 Feb 2024 • Randall Balestriero, Yann Lecun
Despite interpretability of the reconstruction and generation, we identify a misalignment between learning by reconstruction, and learning for perception.
no code implementations • 20 Jan 2024 • Randall Balestriero, Yann Lecun
One fruitful formulation of Deep Networks (DNs) enabling their theoretical study and providing practical guidelines to practitioners relies on Piecewise Affine Splines.
1 code implementation • 3 Jan 2024 • Aarash Feizi, Randall Balestriero, Adriana Romero-Soriano, Reihaneh Rabbany
Any prior knowledge can now be embedded into that metric space independently of the employed DA.
1 code implementation • 4 Dec 2023 • Randall Balestriero, Romain Cosentino, Sarath Shekkizhar
We obtain in closed form (i) the intrinsic dimension in which the Multi-Head Attention embeddings are constrained to exist and (ii) the partition and per-region affine mappings of the per-layer feedforward networks.
no code implementations • 19 Oct 2023 • Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
First, we present a novel statistic that encompasses the local complexity (LC) of the DN based on the concentration of linear regions inside arbitrary dimensional neighborhoods around data points.
no code implementations • 25 May 2023 • Ali Siahkoohi, Rudy Morel, Randall Balestriero, Erwan Allys, Grégory Sainton, Taichi Kawamura, Maarten V. de Hoop
This problem is inherently ill-posed and is further challenged by the variety of timescales exhibited by sources.
no code implementations • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann Lecun, Micah Goldblum
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.
no code implementations • 11 Apr 2023 • Florian Bordes, Samuel Lavoie, Randall Balestriero, Nicolas Ballas, Pascal Vincent
Self-Supervised Learning (SSL) models rely on a pretext task to learn representations.
1 code implementation • ICCV 2023 • Vivien Cabannes, Leon Bottou, Yann Lecun, Randall Balestriero
Third, it provides a proper active learning framework yielding low-cost solutions to annotate datasets, arguably bridging the gap between the theory and practice of active learning based on queries about semantic relationships between inputs that are simple for non-experts to answer.
1 code implementation • 3 Mar 2023 • Florian Bordes, Randall Balestriero, Pascal Vincent
Joint Embedding Self-Supervised Learning (JE-SSL) has seen rapid developments in recent years, due to its promise to effectively leverage large amounts of unlabeled data.
no code implementations • 1 Mar 2023 • Wei-Yin Ko, Daniel D'souza, Karina Nguyen, Randall Balestriero, Sara Hooker
Ensembling multiple Deep Neural Networks (DNNs) is a simple and effective way to improve top-line metrics and to outperform a larger single model.
no code implementations • 1 Mar 2023 • Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, Yann Lecun
Variance-Invariance-Covariance Regularization (VICReg) is a self-supervised learning (SSL) method that has shown promising results on a variety of tasks.
1 code implementation • CVPR 2023 • Ahmed Imtiaz Humayun, Randall Balestriero, Guha Balakrishnan, Richard Baraniuk
In this paper, we go one step further by developing the first provably exact method for computing the geometry of a DN's mapping - including its decision boundary - over a specified region of the data space.
no code implementations • 20 Feb 2023 • Randall Balestriero
Costly, noisy, and over-specialized, labels are to be set aside in favor of unsupervised learning if we hope to learn cheap, reliable, and transferable models.
no code implementations • 6 Feb 2023 • Vivien Cabannes, Bobak T. Kiani, Randall Balestriero, Yann Lecun, Alberto Bietti
Self-supervised learning (SSL) has emerged as a powerful framework to learn representations from raw data without supervision.
no code implementations • 7 Nov 2022 • Vivien Cabannes, Alberto Bietti, Randall Balestriero
Unsupervised representation learning aims at describing raw data efficiently to solve various downstream tasks.
no code implementations • 3 Nov 2022 • Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim
Equipped with ImageNet-X, we investigate 2,200 current recognition models and study the types of mistakes as a function of a model's (1) architecture, e.g., transformer vs. convolutional, (2) learning paradigm, e.g., supervised vs. self-supervised, and (3) training procedures, e.g., data augmentation.
1 code implementation • 2 Nov 2022 • Randall Balestriero, Yann Lecun
In this paper we propose the first provable affine constraint enforcement method for DNNs that requires only minimal changes to a given DNN's forward pass, that is computationally friendly, and that leaves the optimization of the DNN's parameters unconstrained, i.e., standard gradient-based methods can be employed.
no code implementations • 13 Oct 2022 • Mahmoud Assran, Randall Balestriero, Quentin Duval, Florian Bordes, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Nicolas Ballas
A successful paradigm in representation learning is to perform self-supervised pretraining using tasks based on mini-batch statistics (e.g., SimCLR, VICReg, SwAV, MSN).
no code implementations • 5 Oct 2022 • Quentin Garrido, Randall Balestriero, Laurent Najman, Yann Lecun
Joint-Embedding Self-Supervised Learning (JE-SSL) has seen rapid development, with the emergence of many method variations but only a few principled guidelines that would help practitioners successfully deploy them.
no code implementations • 29 Sep 2022 • Grégoire Mialon, Randall Balestriero, Yann Lecun
Self-Supervised Learning (SSL) methods such as VICReg, Barlow Twins or W-MSE avoid collapse of their joint embedding architectures by constraining or regularizing the covariance matrix of their projector's output.
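For readers unfamiliar with these regularizers, a minimal NumPy sketch of VICReg-style variance and covariance terms follows; the invariance term and all training machinery are omitted, and the hyperparameters are illustrative.

```python
import numpy as np

def variance_covariance_terms(Z, gamma=1.0, eps=1e-4):
    """Variance and covariance regularizers on a batch of embeddings Z (N, D),
    in the spirit of VICReg / Barlow-Twins-style decorrelation."""
    Zc = Z - Z.mean(axis=0)                            # center each embedding dimension
    std = np.sqrt(Zc.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, gamma - std))   # hinge keeps each dim's std >= gamma (anti-collapse)
    C = (Zc.T @ Zc) / (len(Z) - 1)                     # covariance matrix of the projector output
    off_diag = C - np.diag(np.diag(C))
    cov_loss = (off_diag ** 2).sum() / Z.shape[1]      # push off-diagonal covariances to zero
    return var_loss, cov_loss

Z = np.random.default_rng(0).normal(size=(256, 32))
print(variance_covariance_terms(Z))
```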
no code implementations • 29 Sep 2022 • Randall Balestriero, Richard G. Baraniuk
A critically important, ubiquitous, and yet poorly understood ingredient in modern deep networks (DNs) is batch normalization (BN), which centers and normalizes the feature maps.
no code implementations • 29 Sep 2022 • Bobak T. Kiani, Randall Balestriero, Yubei Chen, Seth Lloyd, Yann Lecun
The fundamental goal of self-supervised learning (SSL) is to produce useful representations of data without access to any labels for classifying the data.
no code implementations • 20 Jul 2022 • Ravid Shwartz-Ziv, Randall Balestriero, Yann Lecun
In this paper, we examine self-supervised learning methods, particularly VICReg, to provide an information-theoretical understanding of their construction.
no code implementations • 27 Jun 2022 • Florian Bordes, Randall Balestriero, Quentin Garrido, Adrien Bardes, Pascal Vincent
This is a little vexing, as one would hope that the network layer at which invariance is explicitly enforced by the SSL criterion during training (the last projector layer) should be the one to use for best generalization performance downstream.
no code implementations • 23 May 2022 • Randall Balestriero, Yann Lecun
Self-Supervised Learning (SSL) surmises that inputs and pairwise positive relationships are enough to learn meaningful representations.
no code implementations • 7 Apr 2022 • Vishwanath Saragadam, Randall Balestriero, Ashok Veeraraghavan, Richard G. Baraniuk
DeepTensor is a computationally efficient framework for low-rank decomposition of matrices and tensors using deep generative networks.
no code implementations • 7 Apr 2022 • Randall Balestriero, Leon Bottou, Yann Lecun
The optimal amount of DA or weight decay found via cross-validation leads to disastrous model performance on some classes, e.g., on ImageNet with a ResNet-50, the "barn spider" classification test accuracy falls from $68\%$ to $46\%$ merely by introducing random-crop DA during training.
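A hedged sketch of how such per-class effects can be surfaced: averaged top-1 accuracy hides them, so one computes accuracy class by class. The prediction arrays below are synthetic stand-ins, not results from the paper.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    """Accuracy computed separately for each class: the average accuracy can
    improve while individual classes collapse."""
    accs = np.full(n_classes, np.nan)
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            accs[c] = (y_pred[mask] == c).mean()
    return accs

# Synthetic stand-ins for predictions of models trained without / with random-crop DA.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=2000)
y_no_da = np.where(rng.random(2000) < 0.70, y_true, (y_true + 1) % 10)
y_da = np.where((rng.random(2000) < 0.80) & (y_true != 3), y_true, (y_true + 1) % 10)

print(per_class_accuracy(y_true, y_no_da, 10).mean())   # average accuracy goes up with "DA"...
print(per_class_accuracy(y_true, y_da, 10).mean())
print(per_class_accuracy(y_true, y_da, 10)[3])          # ...while class 3 collapses
```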
1 code implementation • 10 Mar 2022 • Bobak Kiani, Randall Balestriero, Yann Lecun, Seth Lloyd
In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability.
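One standard way to keep a layer exactly unitary, shown here as a NumPy/SciPy sketch (the paper's own parametrization may differ), is to take the matrix exponential of a skew-symmetric matrix:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Parametrize an orthogonal (real unitary) layer as exp(A) with A skew-symmetric:
# exp(A) @ exp(A).T = exp(A) @ exp(-A) = I, so the layer preserves norms exactly.
M = rng.normal(size=(8, 8))
A = M - M.T                      # skew-symmetric: A.T == -A
W = expm(A)                      # orthogonal weight matrix

print(np.allclose(W @ W.T, np.eye(8)))           # True: unitarity holds
x = rng.normal(size=8)
print(np.linalg.norm(x), np.linalg.norm(W @ x))  # norms match: no exploding/vanishing signal
```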
no code implementations • 7 Mar 2022 • Rudolf H. Riedi, Randall Balestriero, Richard G. Baraniuk
Building on our earlier work connecting deep networks with continuous piecewise-affine splines, we develop an exact local linear representation of a deep network layer for a family of modern deep networks that includes ConvNets at one end of a spectrum and ResNets, DenseNets, and other networks with skip connections at the other.
1 code implementation • 4 Mar 2022 • Ahmed Imtiaz Humayun, Randall Balestriero, Anastasios Kyrillidis, Richard Baraniuk
We propose to remedy such a scenario by introducing a maximal radius constraint $r$ on the clusters formed by the centroids, i.e., samples from the same cluster should not be more than $2r$ apart in terms of $\ell_2$ distance.
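As a toy illustration of the constraint (a greedy scheme, not the paper's algorithm), the following enforces the radius bound by construction:

```python
import numpy as np

def radius_constrained_clustering(X, r):
    """Greedy sketch: each sample joins the nearest existing centroid within
    distance r, otherwise it seeds a new cluster. By the triangle inequality,
    any two members of a cluster are then at most 2r apart in l2 distance."""
    centroids, labels = [], []
    for x in X:
        if centroids:
            d = np.linalg.norm(np.asarray(centroids) - x, axis=1)
            j = int(d.argmin())
            if d[j] <= r:
                labels.append(j)
                continue
        centroids.append(x)                 # no centroid within r: open a new cluster
        labels.append(len(centroids) - 1)
    return np.asarray(centroids), np.asarray(labels)

X = np.random.default_rng(0).normal(size=(500, 2))
C, y = radius_constrained_clustering(X, r=0.5)
print(len(C), "clusters; every within-cluster pair is <= 2r apart by construction")
```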
1 code implementation • CVPR 2022 • Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
We present Polarity Sampling, a theoretically justified plug-and-play method for controlling the generation quality and diversity of pre-trained deep generative networks (DGNs).
Ranked #1 on Image Generation on LSUN Car 512 x 384
no code implementations • 23 Feb 2022 • CJ Barberan, Sina AlEMohammad, Naiming Liu, Randall Balestriero, Richard G. Baraniuk
A key interpretability issue with RNNs is that it is not clear how each hidden state per time step contributes to the decision-making process in a quantitative manner.
no code implementations • 16 Feb 2022 • Randall Balestriero, Ishan Misra, Yann Lecun
We show that for a training loss to be stable under DA sampling, the model's saliency map (gradient of the loss with respect to the model's input) must align with the smallest eigenvector of the sample variance under the considered DA augmentation, hinting at a possible explanation of why models tend to shift their focus from edges to textures.
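A toy NumPy illustration of the alignment statement, with a linear scorer standing in for the model and additive per-axis noise standing in for the DA (both assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Augmentation": additive noise with very different variance per axis;
# axis 4 barely moves, so it is the smallest eigenvector of the DA covariance.
def augment(x, n=5000):
    scales = np.array([3.0, 1.0, 1.0, 1.0, 0.01])
    return x + rng.normal(size=(n, 5)) * scales

x = rng.normal(size=5)
views = augment(x)
cov = np.cov(views, rowvar=False)        # sample variance under the DA
v_min = np.linalg.eigh(cov)[1][:, 0]     # eigenvector with the smallest eigenvalue

# Linear "losses": their saliency map (input gradient) is just the weight vector.
for w in (v_min, np.array([1.0, 0, 0, 0, 0])):
    losses = views @ w
    align = abs(w @ v_min) / np.linalg.norm(w)
    print(f"alignment with v_min: {align:.2f}, loss std under DA: {losses.std():.3f}")
```

The aligned scorer's loss barely moves under the augmentation, while the misaligned one fluctuates strongly, which is the stability condition the abstract describes.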
no code implementations • 16 Feb 2022 • Romain Cosentino, Randall Balestriero, Yanis Bahroun, Anirvan Sengupta, Richard Baraniuk, Behnaam Aazhang
This enables (i) the reduction of intrinsic nuisances associated with the data, reducing the complexity of the clustering task, improving performance, and producing state-of-the-art results, (ii) clustering in the input space of the data, leading to a fully interpretable clustering algorithm, and (iii) the benefit of convergence guarantees.
2 code implementations • 16 Dec 2021 • Florian Bordes, Randall Balestriero, Pascal Vincent
Discovering what is learned by neural networks remains a challenge.
no code implementations • 18 Oct 2021 • Randall Balestriero, Jerome Pesenti, Yann Lecun
The notions of interpolation and extrapolation are fundamental in various fields, from deep learning to function approximation.
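Under the common convex-hull definition of interpolation, membership can be tested with a linear program; the sketch below (using scipy.optimize.linprog) illustrates why, for a fixed sample size, interpolation becomes rare as dimension grows.

```python
import numpy as np
from scipy.optimize import linprog

def is_interpolation(x, X):
    """True iff x lies in the convex hull of the rows of X, i.e. there exist
    lambda >= 0 with sum(lambda) = 1 and X.T @ lambda = x (an LP feasibility test)."""
    n, d = X.shape
    A_eq = np.vstack([X.T, np.ones((1, n))])   # d equality rows + simplex constraint
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.status == 0                     # feasible <=> interpolation

X = np.random.default_rng(0).normal(size=(100, 2))
print(is_interpolation(np.zeros(2), X))        # essentially always True in 2-D
X_hd = np.random.default_rng(0).normal(size=(100, 80))
print(is_interpolation(np.zeros(80), X_hd))    # almost surely False in high dimension
```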
no code implementations • 15 Oct 2021 • CJ Barberan, Randall Balestriero, Richard G. Baraniuk
Each member of the family is derived from a standard DN architecture by vector quantizing the unit output values and feeding them into a global linear classifier.
1 code implementation • ICLR 2022 • Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
Deep Generative Networks (DGNs) are extensively employed in Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and their variants to approximate the data manifold and distribution.
Ranked #4 on Image Generation on FFHQ 1024 x 1024
no code implementations • 1 Apr 2021 • Randall Balestriero, Richard Baraniuk
Jacobian-vector products (JVPs) form the backbone of many recent developments in Deep Networks (DNs), with applications including faster constrained optimization, regularization with generalization guarantees, and adversarial example sensitivity assessments.
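For intuition, a JVP is a single directional derivative of a vector-valued map; the finite-difference sketch below illustrates this without autodiff (frameworks such as JAX compute the same quantity exactly in forward mode).

```python
import numpy as np

def jvp(f, x, v, eps=1e-6):
    """Jacobian-vector product J_f(x) @ v via a central finite difference:
    no need to materialize the full Jacobian, one directional derivative suffices."""
    return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)

# Toy network-like map: f(x) = W2 @ tanh(W1 @ x)
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(32, 8)), rng.normal(size=(4, 32))
f = lambda x: W2 @ np.tanh(W1 @ x)

x, v = rng.normal(size=8), rng.normal(size=8)
# Reference: build the full Jacobian column by column, then multiply.
J = np.stack([jvp(f, x, e) for e in np.eye(8)], axis=1)
print(np.allclose(jvp(f, x, v), J @ v, atol=1e-4))   # True: the JVP matches J @ v
```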
no code implementations • 7 Jan 2021 • Haoran You, Randall Balestriero, Zhihan Lu, Yutong Kou, Huihong Shi, Shunyao Zhang, Shang Wu, Yingyan Lin, Richard Baraniuk
In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin & yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized.
no code implementations • 16 Dec 2020 • Romain Cosentino, Randall Balestriero, Yanis Bahroun, Anirvan Sengupta, Richard Baraniuk, Behnaam Aazhang
We design an interpretable clustering algorithm aware of the nonlinear structure of image manifolds.
no code implementations • 14 Dec 2020 • Romain Cosentino, Randall Balestriero
The SMF-DSN enhances the DSN by (i) increasing the diversity of the scattering coefficients and (ii) improving its robustness with respect to non-stationary noise.
2 code implementations • 9 Dec 2020 • Sina AlEMohammad, Randall Balestriero, Zichao Wang, Richard Baraniuk
Kernels derived from deep neural networks (DNNs) in the infinite-width regime provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization.
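A minimal sketch of the empirical NTK for a one-hidden-layer ReLU network, using hand-derived parameter gradients (the architecture and scaling are chosen for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 512
W1 = rng.normal(size=(h, d)) / np.sqrt(d)   # NTK-style scaling
w2 = rng.normal(size=h) / np.sqrt(h)

def param_grad(x):
    """Gradient of f(x) = w2 . relu(W1 x) w.r.t. all parameters, flattened."""
    pre = W1 @ x
    g_w2 = np.maximum(pre, 0)                # d f / d w2
    g_W1 = np.outer(w2 * (pre > 0), x)       # d f / d W1 (ReLU gate)
    return np.concatenate([g_w2, g_W1.ravel()])

def empirical_ntk(x, y):
    """NTK entry K(x, y) = <grad_theta f(x), grad_theta f(y)>; as the width h
    grows, this kernel becomes deterministic and nearly constant during training."""
    return param_grad(x) @ param_grad(y)

x, y = rng.normal(size=d), rng.normal(size=d)
print(empirical_ntk(x, y), empirical_ntk(x, x))
```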
no code implementations • NeurIPS 2020 • Randall Balestriero, Sebastien Paris, Richard Baraniuk
Deep Generative Networks (DGNs) with probabilistic modeling of their output and latent space are currently trained via Variational Autoencoders (VAEs).
1 code implementation • 27 Oct 2020 • Sina AlEMohammad, Hossein Babaei, Randall Balestriero, Matt Y. Cheung, Ahmed Imtiaz Humayun, Daniel LeJeune, Naiming Liu, Lorenzo Luzi, Jasper Tan, Zichao Wang, Richard G. Baraniuk
High dimensionality poses many challenges to the use of data, from visualization and interpretation, to prediction and storage for historical preservation.
no code implementations • 20 Sep 2020 • Romain Cosentino, Randall Balestriero, Richard Baraniuk, Behnaam Aazhang
Our regularizations leverage recent advances in the group of transformation learning to enable AEs to better approximate the data manifold without explicitly defining the group underlying the manifold.
no code implementations • 25 Jun 2020 • Lorenzo Luzi, Randall Balestriero, Richard G. Baraniuk
They can be represented in two ways: with an ensemble of networks or with a single network with a truncated latent space.
no code implementations • ICLR 2021 • Sina AlEMohammad, Zichao Wang, Randall Balestriero, Richard Baraniuk
The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization.
no code implementations • 13 Jun 2020 • Randall Balestriero, Herve Glotin, Richard G. Baraniuk
We develop an interpretable and learnable Wigner-Ville distribution that produces a super-resolved quadratic signal representation for time-series analysis.
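For background, a bare-bones discrete Wigner-Ville distribution in NumPy; the paper's learnable, super-resolved variant goes further, and this sketch only shows the quadratic core:

```python
import numpy as np

def wigner_ville(x):
    """Discrete Wigner-Ville distribution of an analytic signal x: for each time n,
    form the instantaneous autocorrelation x[n+k] * conj(x[n-k]) over lags k,
    then FFT over the lag axis to get a quadratic time-frequency map."""
    N = len(x)
    W = np.zeros((N, N), dtype=complex)
    for n in range(N):
        kmax = min(n, N - 1 - n)
        for k in range(-kmax, kmax + 1):
            W[n, k % N] = x[n + k] * np.conj(x[n - k])
        W[n] = np.fft.fft(W[n])   # conjugate-symmetric in k, so the FFT is real
    return W.real

# Linear chirp: the WVD concentrates energy along the instantaneous frequency.
t = np.arange(256)
x = np.exp(2j * np.pi * (0.05 * t + 0.0005 * t ** 2))
tf = wigner_ville(x)
print(tf.shape, tf.max())
```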
1 code implementation • 21 May 2020 • Randall Balestriero
SymJAX is a symbolic programming version of JAX simplifying graph input/output/updates and providing additional functionalities for general machine learning and deep learning applications.
1 code implementation • 26 Feb 2020 • Randall Balestriero, Sebastien Paris, Richard Baraniuk
We also derive the output probability density mapped onto the generated manifold in terms of the latent space density, which enables the computation of key statistics such as its Shannon entropy.
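The density statement is, in spirit, the change-of-variables formula for an injective generator; a sketch, assuming a differentiable $g$ with full-rank Jacobian $J_g$ (notation mine, not copied from the paper):

```latex
% Density induced on the generated manifold at x = g(z):
p_x\bigl(g(z)\bigr) = p_z(z)\,\det\bigl(J_g(z)^{\top} J_g(z)\bigr)^{-1/2}
% and the resulting Shannon (differential) entropy of the output distribution:
H(p_x) = H(p_z) + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\left[\log\det\bigl(J_g(z)^{\top} J_g(z)\bigr)\right]
```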
no code implementations • 25 Sep 2019 • Lorenzo Luzi, Randall Balestriero, Richard Baraniuk
We define a goodness-of-fit measure for generative networks that captures how well the network can generate the training data, which is necessary to learn the true data distribution.
no code implementations • 28 May 2019 • Daniel LeJeune, Randall Balestriero, Hamid Javadi, Richard G. Baraniuk
Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks.
1 code implementation • NeurIPS 2019 • Randall Balestriero, Romain Cosentino, Behnaam Aazhang, Richard Baraniuk
The subdivision process constrains the affine maps on the (exponentially many) power diagram regions to greatly reduce their complexity.
no code implementations • ICLR 2019 • Zichao Wang, Randall Balestriero, Richard Baraniuk
Second, we show that the affine parameter of an RNN corresponds to an input-specific template, from which we can interpret an RNN as performing a simple template matching (matched filtering) given the input.
no code implementations • ICLR 2019 • Randall Balestriero, Richard G. Baraniuk
We show that, under a GMM, piecewise affine, convex nonlinearities like ReLU, absolute value, and max-pooling can be interpreted as solutions to certain natural "hard" VQ inference problems, while sigmoid, hyperbolic tangent, and softmax can be interpreted as solutions to corresponding "soft" VQ inference problems.
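A sketch of the hard/soft correspondence (notation mine; the paper's exact construction may differ in detail):

```latex
% Hard VQ: a max-affine (MASO) unit picks the best of K affine "templates":
z(x) = \max_{k \le K}\; \langle a_k, x \rangle + b_k
% Soft VQ: the argmax is replaced by its softmax posterior (temperature beta):
z_{\mathrm{soft}}(x) = \sum_{k \le K}
  \frac{e^{\beta(\langle a_k, x\rangle + b_k)}}{\sum_{j \le K} e^{\beta(\langle a_j, x\rangle + b_j)}}
  \bigl(\langle a_k, x\rangle + b_k\bigr)
% Example in 1-D with templates (a_1, b_1) = (0, 0) and (a_2, b_2) = (1, 0):
% hard VQ yields ReLU, max(0, x); soft VQ yields x * sigmoid(beta x),
% which smoothly recovers ReLU as beta -> infinity.
```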
no code implementations • ICML 2018 • Randall Balestriero, Richard Baraniuk
This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization.
no code implementations • ICML 2018 • Randall Balestriero, Romain Cosentino, Herve Glotin, Richard Baraniuk
We propose to tackle the problem of end-to-end learning for raw waveform signals by introducing learnable continuous time-frequency atoms.
no code implementations • 17 May 2018 • Randall Balestriero, Richard Baraniuk
For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input.
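This affine form is easy to verify numerically on a toy ReLU network: freeze the activation pattern at $x$ and read off the input-conditioned affine map (a sketch, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 4)), rng.normal(size=32)
W2, b2 = rng.normal(size=(3, 32)), rng.normal(size=3)

def f(x):
    return W2 @ np.maximum(W1 @ x + b1, 0) + b2   # small ReLU (MASO-style) network

def local_affine(x):
    """Within x's spline region, f is exactly affine: f(u) = A_x u + b_x,
    with A_x determined by the ReLU units active at x."""
    gate = (W1 @ x + b1 > 0).astype(float)        # which units are 'on' at x
    A_x = W2 @ (gate[:, None] * W1)
    b_x = W2 @ (gate * b1) + b2
    return A_x, b_x

x = rng.normal(size=4)
A_x, b_x = local_affine(x)
print(np.allclose(f(x), A_x @ x + b_x))           # True
u = x + 1e-3 * rng.normal(size=4)                 # a nearby point, same region w.h.p.
print(np.allclose(f(u), A_x @ u + b_x))           # True while u stays in x's region
```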
no code implementations • 27 Feb 2018 • Randall Balestriero, Herve Glotin, Richard Baraniuk
Deep Neural Networks (DNNs) provide state-of-the-art solutions in several difficult machine perceptual tasks.
no code implementations • 25 Dec 2017 • Romain Cosentino, Randall Balestriero, Richard Baraniuk, Ankit Patel
In this work, we derive a generic overcomplete frame thresholding scheme based on risk minimization.
no code implementations • 12 Nov 2017 • Randall Balestriero, Vincent Roger, Herve G. Glotin, Richard G. Baraniuk
We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems.
no code implementations • 25 Oct 2017 • Randall Balestriero, Richard Baraniuk
Deep Neural Networks (DNNs) are universal function approximators providing state-of-the-art solutions on a wide range of applications.
no code implementations • 18 Jul 2017 • Randall Balestriero
The derived approach leads to the formulation of a Deep Oja Network.
no code implementations • 18 Jul 2017 • Randall Balestriero, Herve Glotin
In this paper we propose a scalable version of a state-of-the-art deterministic time-invariant feature extraction approach based on consecutive changes of basis and nonlinearities, namely, the scattering network.
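For reference, a bare-bones first-order scattering stage in NumPy (the filter design is simplified for illustration; the paper's scalable version differs):

```python
import numpy as np

def scattering_layer(x, n_octaves=6, support=128, pool=32):
    """One scattering stage: band-pass with Morlet-like atoms (change of basis),
    complex modulus (the nonlinearity), then local averaging, giving coefficients
    that are stable and locally translation invariant."""
    t = np.arange(-support // 2, support // 2)
    coeffs = []
    for j in range(n_octaves):
        xi, sigma = 2.5 / 2 ** j, 2.0 * 2 ** j        # dyadic center frequency / scale
        psi = np.exp(1j * xi * t) * np.exp(-t ** 2 / (2 * sigma ** 2))
        psi /= np.abs(psi).sum()
        u = np.abs(np.convolve(x, psi, mode="same"))           # modulus of the band-pass
        s = np.convolve(u, np.ones(pool) / pool, mode="same")  # low-pass averaging
        coeffs.append(s[:: pool // 2])                         # subsample the smooth envelope
    return np.stack(coeffs)

x = np.sin(2 * np.pi * 0.2 * np.arange(1024))
x += 0.1 * np.random.default_rng(0).normal(size=1024)
print(scattering_layer(x).shape)   # (n_octaves, time): first-order scattering coefficients
```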
no code implementations • 23 Feb 2017 • Randall Balestriero
NDT is a decision-tree-like architecture in which each splitting node is an independent multilayer perceptron, allowing oblique decision functions, or arbitrary nonlinear decision functions if more than one layer is used.
no code implementations • 23 Nov 2016 • Randall Balestriero, Behnaam Aazhang
We present a sparse and invariant representation with low asymptotic complexity for robust unsupervised transient and onset zone detection in noisy environments.