1 code implementation • 17 Feb 2022 • Leslie N. Smith
This paper describes the principle of "General Cyclical Training" in machine learning, where training starts and ends with "easy training" and the "hard training" happens during the middle epochs.
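Below is a minimal sketch of the easy-hard-easy idea, assuming a generic "difficulty" knob such as data-augmentation strength or weight decay; the triangular schedule and parameter names are illustrative, not the paper's exact formulation.

```python
# Minimal sketch of an "easy -> hard -> easy" schedule (illustrative, not the
# paper's exact formulation).  `difficulty(epoch)` ramps a generic knob -- e.g.
# data-augmentation strength or weight decay -- up to a peak at mid-training
# and back down, so the first and last epochs are "easy" and the middle is "hard".

def difficulty(epoch: int, total_epochs: int,
               low: float = 0.1, high: float = 1.0) -> float:
    """Triangular easy-hard-easy schedule over training."""
    mid = total_epochs / 2.0
    frac = 1.0 - abs(epoch - mid) / mid   # 0 at the ends, 1 at the midpoint
    return low + (high - low) * frac

if __name__ == "__main__":
    total = 100
    for e in (0, 25, 50, 75, 99):
        print(e, round(difficulty(e, total), 3))
```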
1 code implementation • 16 Feb 2022 • Leslie N. Smith
In this paper, we introduce a novel cyclical focal loss and demonstrate that it is a more universal loss function than cross-entropy softmax loss or focal loss.
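The sketch below illustrates the general idea in PyTorch, blending an easy-sample-weighted cross-entropy term with a standard focal term via a cyclical weight; the weighting function, parameter names, and gamma value are illustrative assumptions rather than the paper's exact formulation.

```python
# Illustrative sketch of the cyclical focal loss idea.  A focal term,
# (1 - p)^gamma * CE, emphasizes hard samples, while a "confident" term,
# p^gamma * CE, emphasizes easy ones; a cyclical weight xi blends them so
# easy samples dominate early and late in training and hard samples
# dominate in the middle epochs.
import torch
import torch.nn.functional as F

def cyclical_focal_loss(logits, targets, epoch, total_epochs, gamma=2.0):
    ce = F.cross_entropy(logits, targets, reduction="none")
    p = torch.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    focal_term = (1.0 - p) ** gamma * ce   # up-weights hard samples
    easy_term = p ** gamma * ce            # up-weights confident samples
    # xi goes 1 -> 0 -> 1 over training (triangular cycle).
    xi = abs(1.0 - 2.0 * epoch / max(total_epochs - 1, 1))
    return (xi * easy_term + (1.0 - xi) * focal_term).mean()
```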
no code implementations • 18 Nov 2020 • Helena E. Liu, Leslie N. Smith
Specifically, we show that by combining semi-supervised learning with a one-stage, single-network version of self-training, our FROST methodology trains faster and is more robust to the choice of labeled samples and to changes in hyper-parameters.
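A minimal pseudo-labeling sketch of the self-training ingredient is shown below (PyTorch-style); it is not the full FROST pipeline, and the confidence threshold and loss weight are illustrative assumptions.

```python
# Minimal pseudo-labeling sketch of self-training (not the full FROST
# pipeline).  Confident model predictions on unlabeled data are treated as
# labels for an extra loss term; the threshold and weight are illustrative.
import torch
import torch.nn.functional as F

def self_training_loss(model, unlabeled_batch, threshold=0.95, weight=1.0):
    with torch.no_grad():
        probs = torch.softmax(model(unlabeled_batch), dim=1)
        conf, pseudo_labels = probs.max(dim=1)
        mask = conf >= threshold                  # keep only confident samples
    logits = model(unlabeled_batch)
    loss = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return weight * (loss * mask.float()).mean()
```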
1 code implementation • 16 Jun 2020 • Leslie N. Smith, Adam Conovaloff
Reaching the performance of fully supervised learning while labeling only one sample per class and otherwise using unlabeled data would be ideal for deep learning applications.
no code implementations • 8 Apr 2020 • Leslie N. Smith, Adam Conovaloff
One of the greatest obstacles in the adoption of deep neural networks for new applications is that training the network typically requires a large number of manually labeled training samples.
no code implementations • 23 Oct 2019 • Leslie N. Smith
In addition, there are several papers in the adversarial defense literature that claim there is a cost for adversarial robustness, or a trade-off between robustness and accuracy; under this proposed taxonomy, we hypothesize that this trade-off is not universal.
28 code implementations • 26 Mar 2018 • Leslie N. Smith
Although deep learning has produced dazzling successes for image, speech, and video processing applications in the past few years, most networks are trained with suboptimal hyper-parameters, requiring unnecessarily long training times.
no code implementations • ICLR 2018 • Leslie N. Smith, Nicholay Topin
In this paper, we show a phenomenon, which we named "super-convergence", where residual networks can be trained with an order of magnitude fewer iterations than are needed by standard training methods.
10 code implementations • 23 Aug 2017 • Leslie N. Smith, Nicholay Topin
One of the key elements of super-convergence is training with one learning rate cycle and a large maximum learning rate.
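A minimal sketch of such a policy, using PyTorch's built-in OneCycleLR scheduler, is shown below; the model, max_lr value, and step counts are illustrative, and in practice the peak learning rate is chosen with an LR range test.

```python
# Minimal sketch of a one-cycle policy with a large maximum learning rate,
# using PyTorch's built-in OneCycleLR scheduler (max_lr below is illustrative;
# in practice it is chosen with an LR range test).
import torch

model = torch.nn.Linear(10, 2)                        # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
steps_per_epoch, epochs = 100, 20
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=3.0,                            # large peak learning rate
    epochs=epochs, steps_per_epoch=steps_per_epoch)   # also cycles momentum

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        x = torch.randn(32, 10)
        loss = model(x).pow(2).mean()                 # dummy loss for the sketch
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()                              # one scheduler step per batch
```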
no code implementations • 5 Apr 2017 • Leslie N. Smith
This report is aimed at readers who are subject matter experts in their application but deep learning novices.
2 code implementations • 14 Feb 2017 • Leslie N. Smith, Nicholay Topin
We present observations and discussion of previously unreported phenomena discovered while training residual networks.
1 code implementation • 2 Nov 2016 • Leslie N. Smith, Nicholay Topin
Recent research in the deep learning field has produced a plethora of new architectures.
no code implementations • CVPR 2016 • Leslie N. Smith, Emily M. Hand, Timothy Doster
In particular, an untrainable deep network starts as a trainable shallow network, and new layers are slowly, organically added during training, thereby increasing the network's depth.
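The sketch below shows one way such gradual layer insertion could look in PyTorch: a wrapper that bypasses a newly added layer with a probability that is annealed toward zero; this wrapper and its annealing rule are illustrative assumptions, not the paper's exact mechanism.

```python
# Illustrative sketch of gradually "dropping in" a new layer during training
# (not the paper's exact mechanism): the new layer is bypassed with
# probability p_bypass, and p_bypass is annealed from 1 toward 0 so the
# layer's contribution grows organically as training progresses.
import torch
import torch.nn as nn

class DropIn(nn.Module):
    def __init__(self, layer: nn.Module):
        super().__init__()
        self.layer = layer
        self.p_bypass = 1.0          # start fully bypassed (shallow network)

    def forward(self, x):
        if self.training and torch.rand(1).item() < self.p_bypass:
            return x                 # identity path: layer not yet active
        return self.layer(x)

# During training, anneal p_bypass toward 0, e.g. once per epoch:
# dropin.p_bypass = max(0.0, 1.0 - epoch / warmup_epochs)
```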
50 code implementations • 3 Jun 2015 • Leslie N. Smith
This paper describes a new method for setting the learning rate, named cyclical learning rates, which practically eliminates the need to experimentally find the best values and schedule for the global learning rates.
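A minimal sketch of a triangular cyclical learning rate, using PyTorch's built-in CyclicLR scheduler, is shown below; the model and the base_lr/max_lr bounds are illustrative and would normally be chosen with the LR range test described in the paper.

```python
# Minimal sketch of a triangular cyclical learning rate using PyTorch's
# built-in CyclicLR scheduler; the base_lr/max_lr bounds are illustrative
# and would normally be picked with an LR range test.
import torch

model = torch.nn.Linear(10, 2)                        # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.001, max_lr=0.1,
    step_size_up=2000, mode="triangular")

for step in range(4000):                              # one full cycle = 4000 steps
    x = torch.randn(32, 10)
    loss = model(x).pow(2).mean()                     # dummy loss for the sketch
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()                                  # update the LR every batch
```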
no code implementations • 6 May 2013 • Leslie N. Smith
The potential of compressive sensing (CS) has spurred great interest in the research community, and CS is a fast-growing area of research.