Seminar series
Date
Fri, 14 Feb 2020
Time
12:00 - 13:00
Location
L4
Speaker
Konstantin Mishchenko
Organisation
King Abdullah University of Science and Technology (KAUST)

We show that two rules are sufficient to automate gradient descent: 1) don't increase the stepsize too fast and 2) don't overstep the local curvature. No need for function values, no line search, no information about the function except for its gradients. By following these rules, you get a method adaptive to the local geometry, with convergence guarantees depending only on smoothness in a neighborhood of a solution. As long as the problem is convex, our method will converge even if the global smoothness constant is infinite. As an illustration, it can minimize any twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including matrix factorization and the training of ResNet-18.
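Below is a minimal sketch of how such an adaptive stepsize might be implemented, following the two rules described in the abstract: cap how fast the stepsize can grow, and bound each step by an estimate of the local inverse curvature obtained from consecutive gradients. The function name adaptive_gd, the initial stepsize lam0, the growth cap sqrt(1 + theta), and the safeguarding choices are illustrative assumptions, not necessarily the speaker's exact algorithm.

import numpy as np

def adaptive_gd(grad, x0, n_iters=1000, lam0=1e-6):
    """Gradient descent with an adaptive stepsize driven by two rules:
    (1) the stepsize may not grow too fast between iterations, and
    (2) the step must not exceed an estimate of the local inverse curvature.
    Illustrative sketch only; the exact update rule is an assumption."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam_prev = lam0
    theta = np.inf                       # ratio of consecutive stepsizes
    x = x_prev - lam_prev * g_prev       # one plain gradient step to start

    for _ in range(n_iters):
        g = grad(x)
        # Rule 2: bound the stepsize by the local inverse curvature,
        # estimated from the observed change of the gradient.
        denom = 2.0 * np.linalg.norm(g - g_prev)
        local = np.linalg.norm(x - x_prev) / denom if denom > 0 else np.inf
        # Rule 1: do not let the stepsize grow too fast.
        lam = min(np.sqrt(1.0 + theta) * lam_prev, local)
        if not np.isfinite(lam):
            lam = lam_prev               # safeguard for the degenerate case

        x_prev, g_prev = x, g
        x = x - lam * g
        theta, lam_prev = lam / lam_prev, lam
    return x

For example, calling adaptive_gd(lambda x: A @ x - b, np.zeros(len(b))) minimizes the quadratic 0.5 * x'Ax - b'x without any knowledge of the smoothness constant of A; only gradients are used, with no function values and no line search.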
