no code implementations • 23 Mar 2020 • Charles G. Frye, James Simon, Neha S. Wadia, Andrew Ligeralde, Michael R. DeWeese, Kristofer E. Bouchard
Although the loss functions of deep neural networks are highly non-convex, gradient-based optimization algorithms converge to approximately the same performance from many random initial points.
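To make the convergence-from-random-initializations claim concrete, here is a minimal sketch (not from the paper) that trains a small tanh network on a synthetic regression task from several random seeds using plain gradient descent in JAX; the architecture, data, and hyperparameters are all illustrative assumptions.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy setup: two-layer tanh network, synthetic regression data.
def init_params(key, d_in=2, d_hid=8):
    k1, k2 = jax.random.split(key)
    return {"W1": jax.random.normal(k1, (d_hid, d_in)) / jnp.sqrt(d_in),
            "b1": jnp.zeros(d_hid),
            "w2": jax.random.normal(k2, (d_hid,)) / jnp.sqrt(d_hid),
            "b2": 0.0}

def predict(params, X):
    h = jnp.tanh(X @ params["W1"].T + params["b1"])
    return h @ params["w2"] + params["b2"]

def loss(params, X, y):
    return 0.5 * jnp.mean((predict(params, X) - y) ** 2)

data_key = jax.random.PRNGKey(42)
X = jax.random.normal(data_key, (128, 2))
y = jnp.sin(X[:, 0]) - 0.5 * X[:, 1]          # fixed target function

grad_fn = jax.jit(jax.grad(loss))
for seed in range(5):                          # several random initial points
    params = init_params(jax.random.PRNGKey(seed))
    for _ in range(2000):                      # plain gradient descent
        g = grad_fn(params, X, y)
        params = jax.tree_util.tree_map(lambda p, gi: p - 0.1 * gi, params, g)
    # Final losses typically land close together despite the non-convexity.
    print(f"seed {seed}: final loss {float(loss(params, X, y)):.4f}")
```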
no code implementations • 12 Jun 2019 • Charles G. Frye
Understanding the behavior of algorithms that optimize a differentiable loss function is enhanced by knowledge of the critical points of that loss function, i.e. the points where the gradient is zero.
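As a concrete illustration of the definition, the following sketch checks that the origin is a critical point of the toy surface x² − y² (its gradient vanishes there) and classifies it as a saddle from the Hessian's mixed-sign eigenvalues; the surface is an illustrative choice, not one from the paper.

```python
import jax
import jax.numpy as jnp

# A toy non-convex surface with a known saddle point at the origin.
def f(p):
    x, y = p
    return x**2 - y**2

theta = jnp.array([0.0, 0.0])                  # candidate critical point
g = jax.grad(f)(theta)
print("gradient:", g, "norm:", jnp.linalg.norm(g))  # norm ~ 0 => critical point

# Classify the critical point via the Hessian: mixed-sign eigenvalues => saddle.
H = jax.hessian(f)(theta)
print("Hessian eigenvalues:", jnp.linalg.eigvalsh(H))  # [-2., 2.]
```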
no code implementations • 29 Jan 2019 • Charles G. Frye, Neha S. Wadia, Michael R. DeWeese, Kristofer E. Bouchard
Numerically locating the critical points of non-convex surfaces is a long-standing problem central to many fields.
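One widely studied family of approaches in this literature recasts the search as minimizing the squared gradient norm ½‖∇L‖², which is zero exactly at the critical points of L. The sketch below applies that idea to a hypothetical double-well surface; the surface, starting point, and step size are illustrative assumptions, not details from the paper. Starting in the saddle's basin, the method converges to the saddle at the origin, a point that plain gradient descent on L itself would flee.

```python
import jax
import jax.numpy as jnp

# Hypothetical double-well surface: minima at (+/-1, 0), saddle at (0, 0).
def L(p):
    x, y = p
    return x**4 / 4 - x**2 / 2 + y**2 / 2

# Squared-gradient-norm objective: its zeros are exactly the critical
# points of L (minima, maxima, and saddles alike).
def sq_grad_norm(p):
    g = jax.grad(L)(p)
    return 0.5 * jnp.dot(g, g)

descend = jax.jit(jax.grad(sq_grad_norm))      # equals H(p) @ grad L(p)

p = jnp.array([0.3, 0.8])                      # start near the saddle's basin
for _ in range(2000):
    p = p - 0.1 * descend(p)                   # gradient descent on ||grad L||^2

print("candidate critical point:", p)                   # ~ (0, 0)
print("||grad L||:", jnp.linalg.norm(jax.grad(L)(p)))   # ~ 0: found the saddle
```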