no code implementations • 2 Feb 2024 • Tuan Anh Le, Pavel Sountsov, Matthew D. Hoffman, Ben Lee, Brian Patton, Rif A. Saurous
How do we infer a 3D scene from a single image in the presence of corruptions like rain, snow or fog?
no code implementations • NeurIPS 2023 • Du Phan, Matthew D. Hoffman, David Dohan, Sholto Douglas, Tuan Anh Le, Aaron Parisi, Pavel Sountsov, Charles Sutton, Sharad Vikram, Rif A. Saurous
Large language models (LLMs) solve problems more accurately and interpretably when instructed to work out the answer step by step using a "chain-of-thought" (CoT) prompt.
1 code implementation • 13 Jul 2023 • Feras A. Saad, Brian J. Patton, Matthew D. Hoffman, Rif A. Saurous, Vikash K. Mansinghka
This paper presents a new approach to automatically discovering accurate models of complex time series data.
no code implementations • 27 Oct 2022 • Matthew D. Hoffman, Tuan Anh Le, Pavel Sountsov, Christopher Suter, Ben Lee, Vikash K. Mansinghka, Rif A. Saurous
The problem of inferring object shape from a single 2D image is underconstrained.
no code implementations • 17 Jun 2022 • Lucas Theis, Tim Salimans, Matthew D. Hoffman, Fabian Mentzer
Unlike modern compression schemes, which rely on transform coding and quantization to restrict the transmitted information, DiffC relies on the efficient communication of pixels corrupted by Gaussian noise.
3 code implementations • 29 Apr 2021 • Pavel Izmailov, Sharad Vikram, Matthew D. Hoffman, Andrew Gordon Wilson
The posterior over Bayesian neural network (BNN) parameters is extremely high-dimensional and non-convex.
no code implementations • 6 Nov 2020 • Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, Rajiv Raman, Kim Ramasamy, Rory Sayres, Jessica Schrouff, Martin Seneviratne, Shannon Sequeira, Harini Suresh, Victor Veitch, Max Vladymyrov, Xuezhi Wang, Kellie Webster, Steve Yadlowsky, Taedong Yun, Xiaohua Zhai, D. Sculley
Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains.
no code implementations • 4 Feb 2020 • Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, Joshua V. Dillon
Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century.
no code implementations • 23 Oct 2019 • Alexey Radul, Brian Patton, Dougal Maclaurin, Matthew D. Hoffman, Rif A. Saurous
We present a general approach to batching arbitrary computations for accelerators such as GPUs.
no code implementations • Advances in Approximate Bayesian Inference (AABI) Symposium 2019 • Matthew D. Hoffman, Yian Ma
Variational inference (VI) and Markov chain Monte Carlo (MCMC) are approximate posterior inference algorithms that are often said to have complementary strengths, with VI being fast but biased and MCMC being slower but asymptotically unbiased.
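To make the contrast concrete, here is a minimal random-walk Metropolis sampler in plain Python. It illustrates only the MCMC side of the comparison (slower but asymptotically unbiased), not the combined method this paper studies; the Gaussian target and all names here are illustrative choices, not from the paper.

```python
import math
import random

def log_post(theta):
    # Unnormalized log-density of N(2, 1); the exact posterior mean is 2,
    # so we can check the sampler's asymptotic unbiasedness against it.
    return -0.5 * (theta - 2.0) ** 2

def metropolis(logp, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: propose a Gaussian perturbation and accept
    with probability min(1, p(proposal) / p(current))."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n_samples):
        prop = x + rng.gauss(0.0, step)
        if math.log(rng.random()) < logp(prop) - logp(x):
            x = prop
        chain.append(x)  # rejected proposals repeat the current state
    return chain
```

Discarding a burn-in prefix and averaging the remaining draws recovers the true mean to within Monte Carlo error, at the cost of many correlated samples; a Gaussian variational fit would be essentially instant here, which is the trade-off the abstract describes.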
1 code implementation • ICML 2020 • Maria I. Gorinova, Dave Moore, Matthew D. Hoffman
Probabilistic programming has emerged as a powerful paradigm in statistics, applied science, and machine learning: by decoupling modelling from inference, it promises to allow modellers to directly reason about the processes generating data.
2 code implementations • NeurIPS 2018 • Matthew D. Hoffman, Matthew J. Johnson, Dustin Tran
Deriving conditional and marginal distributions using conjugacy relationships can be time-consuming and error-prone.
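The flavor of derivation being automated can be shown with the simplest conjugate pair, a Beta prior with Bernoulli observations. This is a hand-worked illustration of one conjugacy relationship, not the paper's system; the function names are illustrative.

```python
def beta_posterior(a, b, successes, failures):
    # Conjugacy: a Beta(a, b) prior combined with a Bernoulli likelihood
    # gives the posterior Beta(a + successes, b + failures) in closed form.
    return a + successes, b + failures

def numeric_posterior_mean(a, b, successes, failures, grid=20000):
    # Brute-force check: integrate the unnormalized posterior on a grid.
    num = den = 0.0
    for i in range(1, grid):
        t = i / grid
        w = t ** (a - 1 + successes) * (1 - t) ** (b - 1 + failures)
        num += t * w
        den += w
    return num / den
```

With a = b = 2 and 7 successes out of 10 trials, the conjugate update gives Beta(9, 5), whose mean 9/14 matches the numerical integral; automating exactly this kind of bookkeeping for large models is the point of the paper.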
no code implementations • 16 Oct 2018 • Sharad Vikram, Matthew D. Hoffman, Matthew J. Johnson
In variational autoencoders, the prior on the latent codes $z$ is often treated as an afterthought, but the prior shapes the kind of latent representation that the model learns.
12 code implementations • ICLR 2019 • Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck
This is impractical for long sequences such as musical compositions since their memory complexity for intermediate relative information is quadratic in the sequence length.
Ranked #3 on Music Modeling on JSB Chorales
18 code implementations • 16 Feb 2018 • Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara
We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of the linear factor models that still largely dominate collaborative filtering research.
Ranked #5 on Recommendation Systems on Million Song Dataset
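The multinomial likelihood mentioned above is simple to write down: item scores are pushed through a softmax, and the log-likelihood of a user's click vector is the count-weighted sum of log-probabilities (up to an additive constant). A generic sketch with illustrative names, not code from the paper:

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def multinomial_log_lik(counts, scores):
    # log p(x | pi) up to the multinomial coefficient: sum_i x_i * log(pi_i)
    probs = softmax(scores)
    return sum(x * math.log(p) for x, p in zip(counts, probs))
```

With uniform scores over three items and a single click, this returns log(1/3); in a variational autoencoder the scores would come from the decoder network.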
2 code implementations • ICLR 2018 • Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein
We present a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution.
no code implementations • ICML 2017 • Matthew D. Hoffman
Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data.
no code implementations • 17 Apr 2017 • Ardavan Saeedi, Matthew D. Hoffman, Stephen J. DiVerdi, Asma Ghandeharioun, Matthew J. Johnson, Ryan P. Adams
Professional-grade software applications are powerful but complicated: expert users can achieve impressive results, but novices often struggle to complete even basic tasks.
1 code implementation • 13 Apr 2017 • Stephan Mandt, Matthew D. Hoffman, David M. Blei
Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distribution to a posterior, minimizing the Kullback-Leibler divergence between these two distributions.
no code implementations • 13 Jan 2017 • Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei
By treating inference as a first-class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning.
no code implementations • 8 Feb 2016 • Stephan Mandt, Matthew D. Hoffman, David M. Blei
With constant learning rates, it is a stochastic process that, after an initial phase of convergence, generates samples from a stationary distribution.
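That stationary-distribution claim is easy to check on a toy problem. The sketch below runs constant-step-size SGD on a one-dimensional quadratic loss with additive Gaussian gradient noise (a simplifying assumption of this sketch, not the paper's general setting) and compares the empirical variance of the iterates to the closed-form stationary variance eps * sigma^2 / (h * (2 - eps * h)) for this toy model.

```python
import random

def constant_sgd_variance(h=1.0, sigma=1.0, eps=0.1, n_steps=200000,
                          burn_in=2000, seed=0):
    # SGD on the loss h * theta^2 / 2; the gradient h * theta is observed
    # with additive N(0, sigma^2) noise, so the update is an AR(1) process:
    # theta <- (1 - eps * h) * theta - eps * noise.
    rng = random.Random(seed)
    theta, total, total_sq, count = 0.0, 0.0, 0.0, 0
    for t in range(n_steps):
        noisy_grad = h * theta + rng.gauss(0.0, sigma)
        theta -= eps * noisy_grad
        if t >= burn_in:  # let the process reach stationarity first
            total += theta
            total_sq += theta * theta
            count += 1
    mean = total / count
    return total_sq / count - mean * mean
```

With the defaults the empirical variance matches the AR(1) prediction 0.1 / 1.9 to within Monte Carlo error, which is the "samples from a stationary distribution" behaviour the abstract describes.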
1 code implementation • 23 Sep 2015 • Bob Carpenter, Matthew D. Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, Michael Betancourt
As computational challenges in optimization and statistical inference grow ever harder, algorithms that utilize derivatives are becoming increasingly important.
no code implementations • 28 May 2015 • Lucas Theis, Matthew D. Hoffman
However, the algorithm is prone to local optima, which can make the quality of the posterior approximation sensitive to the choice of hyperparameters and initialization.
no code implementations • 25 Nov 2014 • Hamid Izadinia, Ali Farhadi, Aaron Hertzmann, Matthew D. Hoffman
This paper proposes direct learning of image classification from user-supplied tags, without filtering.
no code implementations • 7 Nov 2014 • Dawen Liang, Matthew D. Hoffman
The beta process is the standard nonparametric Bayesian prior for latent factor models.
no code implementations • 16 Apr 2014 • Matthew D. Hoffman, David M. Blei
Stochastic variational inference makes it possible to approximate posterior distributions induced by large datasets quickly using stochastic optimization.
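The ingredient that makes such stochastic optimization converge exactly, rather than hover around the optimum, is a decreasing Robbins-Monro step-size schedule. A toy sketch of that schedule (not the paper's algorithm, which applies natural-gradient steps to variational parameters): estimate a dataset's mean from one subsampled point per step.

```python
import random

def robbins_monro_mean(data, n_steps=50000, kappa=0.6, seed=0):
    # Stochastic approximation of the mean: each step sees one randomly
    # subsampled data point and moves theta by a decreasing step
    # rho_t = (t + 1)^(-kappa).
    rng = random.Random(seed)
    theta = 0.0
    for t in range(n_steps):
        x = data[rng.randrange(len(data))]
        rho = (t + 1.0) ** (-kappa)
        theta -= rho * (theta - x)
    return theta
```

With kappa in (0.5, 1], the steps satisfy the Robbins-Monro conditions (they sum to infinity while their squares do not), so the subsampling noise averages out and theta converges to the true mean.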
1 code implementation • 20 Dec 2013 • Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore
We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain.
8 code implementations • 18 Nov 2011 • Matthew D. Hoffman, Andrew Gelman
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information.
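The gradient-informed steps described here are easy to sketch. Below is a minimal HMC sampler with a leapfrog integrator and a fixed step size and path length, i.e. the hand-tuned variant whose tuning burden motivates the No-U-Turn Sampler; this is an illustrative sketch, not the paper's implementation, and all names are illustrative.

```python
import math
import random

def leapfrog(q, p, eps, n_steps, grad_u):
    # Leapfrog integration of Hamiltonian dynamics: half-step on momentum,
    # alternating full steps on position and momentum, final half-step.
    p -= 0.5 * eps * grad_u(q)
    for _ in range(n_steps - 1):
        q += eps * p
        p -= eps * grad_u(q)
    q += eps * p
    p -= 0.5 * eps * grad_u(q)
    return q, p

def hmc(u, grad_u, q0, n_samples, eps=0.2, n_leapfrog=10, seed=0):
    """Sample from exp(-u(q)) using first-order gradient information."""
    rng = random.Random(seed)
    q, chain = q0, []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                 # resample momentum
        h0 = u(q) + 0.5 * p * p                 # current Hamiltonian
        q_new, p_new = leapfrog(q, p, eps, n_leapfrog, grad_u)
        h1 = u(q_new) + 0.5 * p_new * p_new
        if math.log(rng.random()) < h0 - h1:    # Metropolis correction
            q = q_new
        chain.append(q)
    return chain
```

For a standard normal target (u(q) = q^2 / 2, gradient q), the chain's mean and variance approach 0 and 1 to within Monte Carlo error; NUTS removes the need to pick eps and n_leapfrog by hand.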