Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions
Related Concepts
mathematics
linear subspace
saddle point
lipschitz continuity
differentiable function
lebesgue measure
invariant (physics)
nabla symbol
gradient descent
regular polygon
absolute continuity
eigenvalues and eigenvectors
convex function
upper and lower bounds
maxima and minima
lebesgue integration
mathematical analysis
applied mathematics
combinatorics
pure mathematics
geometry
computer science
omega
mathematical physics
artificial neural network
quantum mechanics
physics
machine learning
Ioannis Panageas, Georgios Piliouras · 2017 · Open Access · DOI: https://doi.org/10.4230/lipics.itcs.2017.2 · OA: W2964204240
Given a twice continuously differentiable cost function f, we prove that the set of initial conditions from which gradient descent converges to saddle points where \nabla^2 f has at least one strictly negative eigenvalue has (Lebesgue) measure zero, even for cost functions f with non-isolated critical points, answering an open question in [Lee, Simchowitz, Jordan, Recht, COLT 2016]. Moreover, this result extends to forward-invariant convex subspaces, allowing for weak (non-globally Lipschitz) smoothness assumptions. Finally, we produce an upper bound on the allowable step-size.
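To make the statement concrete, here is a minimal numerical sketch in Python (an illustration under assumptions, not the paper's construction). The objective f(x, y) = (xy - 1)^2 is a hypothetical choice: it has a strict saddle at the origin, where the Hessian has eigenvalues +2 and -2, and a non-isolated set of minima along the hyperbola {xy = 1}, matching the setting of the theorem. Generic random initializations converge to the minima; the saddle is reached only from its stable set, which here contains the measure-zero line y = -x.

```python
import numpy as np

# Hypothetical illustrative objective (not from the paper):
# f(x, y) = (x*y - 1)^2 has a strict saddle at the origin
# (Hessian [[0, -2], [-2, 0]], eigenvalues +2 and -2) and a
# hyperbola of non-isolated minima {x*y = 1}.
def grad(p):
    x, y = p
    r = x * y - 1.0
    return np.array([2.0 * r * y, 2.0 * r * x])

def gradient_descent(p0, step=0.02, iters=20000):
    """Plain gradient descent with a fixed step size, assumed small
    enough for the region the iterates visit."""
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p - step * grad(p)
    return p

rng = np.random.default_rng(0)

# Generic (random) initial conditions converge to the minima:
# x*y is approximately 1 at the limit point.
for _ in range(5):
    p = gradient_descent(rng.uniform(-2.0, 2.0, size=2))
    print(f"limit = {p}, x*y = {p[0] * p[1]:+.6f}")

# The saddle's stable set has measure zero. Here the line y = -x is
# forward-invariant under the update, and iterates started on it
# contract to the saddle at the origin.
p = gradient_descent([0.5, -0.5])
print(f"started on y = -x: limit = {p}")
```

The fixed step size 0.02 is a conservative guess: since this f is quartic, its gradient is not globally Lipschitz, which is exactly the regime the paper's forward-invariance analysis and step-size upper bound are meant to handle, and the step must stay below the inverse of the gradient's Lipschitz constant on the region the iterates actually visit.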