Search Results for author: George Dahl

Found 6 papers, 1 paper with code

A Loss Curvature Perspective on Training Instability in Deep Learning

no code implementations • 8 Oct 2021 • Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George Dahl, Zachary Nado, Orhan Firat

In this work, we study the evolution of the loss Hessian across many classification tasks in order to understand the effect the curvature of the loss has on the training dynamics.

Navigate

Peptide-Spectra Matching from Weak Supervision

no code implementations • 20 Aug 2018 • Samuel S. Schoenholz, Sean Hackett, Laura Deming, Eugene Melamud, Navdeep Jaitly, Fiona McAllister, Jonathon O'Brien, George Dahl, Bryson Bennett, Andrew M. Dai, Daphne Koller

As in many other scientific domains, we face a fundamental problem when using machine learning to identify proteins from mass spectrometry data: large ground truth datasets mapping inputs to correct outputs are extremely difficult to obtain.

On the importance of initialization and momentum in deep learning

no code implementations • Proceedings of the 30th International Conference on Machine Learning 2013 • Ilya Sutskever, James Martens, George Dahl, Geoffrey Hinton

Deep and recurrent neural networks (DNNs and RNNs respectively) are powerful models that were considered to be almost impossible to train using stochastic gradient descent with momentum.

Second-order methods

Deep Neural Networks for Acoustic Modeling in Speech Recognition

no code implementations • Signal Processing Magazine 2012 • Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, Brian Kingsbury

Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input.

Speech Recognition

Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine

no code implementations • NeurIPS 2010 • George Dahl, Marc'Aurelio Ranzato, Abdel-rahman Mohamed, Geoffrey E. Hinton

Straightforward application of Deep Belief Nets (DBNs) to acoustic modeling produces a rich distributed representation of speech data that is useful for recognition and yields impressive results on the speaker-independent TIMIT phone recognition task.
