1 code implementation • 3 Mar 2022 • Michael Zhang, Nimit S. Sohoni, Hongyang R. Zhang, Chelsea Finn, Christopher Ré
As ERM models can be good spurious attribute predictors, CNC works by (1) using a trained ERM model's outputs to identify samples with the same class but dissimilar spurious features, and (2) training a robust model with contrastive learning to learn similar representations for same-class samples.
no code implementations • 31 Dec 2021 • Nimit S. Sohoni, Maziar Sanjabi, Nicolas Ballas, Aditya Grover, Shaoliang Nie, Hamed Firooz, Christopher Ré
Theoretically, we provide generalization bounds for our approach in terms of the worst-group performance, which scale with respect to both the total number of training points and the number of training points with group labels.
no code implementations • ICCV 2021 • Fait Poms, Vishnu Sarukkai, Ravi Teja Mullapudi, Nimit S. Sohoni, William R. Mark, Deva Ramanan, Kayvon Fatahalian
For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs.
1 code implementation • 1 Jul 2021 • Mayee Chen, Karan Goel, Nimit S. Sohoni, Fait Poms, Kayvon Fatahalian, Christopher Ré
If an unlabeled sample from the target distribution is available, along with a labeled sample from a possibly different source distribution, standard approaches such as importance weighting can be applied to estimate performance on the target.
2 code implementations • ICLR 2020 • Tri Dao, Nimit S. Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré
Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps.
1 code implementation • NeurIPS 2020 • Nimit S. Sohoni, Jared A. Dunnmon, Geoffrey Angus, Albert Gu, Christopher Ré
As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses.
1 code implementation • 27 Jun 2019 • Oliver Hinder, Aaron Sidford, Nimit S. Sohoni
This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant $\gamma \in (0, 1]$, where $\gamma = 1$ encompasses the classes of smooth convex and star-convex functions, and smaller values of $\gamma$ indicate that the function can be "more nonconvex."
no code implementations • 24 Apr 2019 • Nimit S. Sohoni, Christopher R. Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré
In this paper we study a fundamental question: How much memory is actually needed to train a neural network?