Date: Thu, 21 Nov 2024
Time: 12:00 - 12:30
Location: Lecture Room 6
Speaker: Karl Welzel
Organisation: University of Oxford

Tensor methods are methods for unconstrained continuous optimization that can incorporate derivative information of up to order p > 2 by computing a step based on the pth-order Taylor expansion at each iteration. The most important among them are regularization-based tensor methods, which have been shown to achieve optimal worst-case iteration complexity for finding an approximate minimizer. Moreover, as one might expect, this worst-case complexity improves as p increases, highlighting the potential advantage of tensor methods. Still, the global complexity results only guarantee pessimistic sublinear rates, so it is natural to ask how the local rates depend on the order of the Taylor expansion p. In the case of strongly convex functions and a fixed regularization parameter, the answer is given in a 2022 paper by Doikov and Nesterov: we get pth-order local convergence of function values and gradient norms.
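For concreteness, a typical regularization-based tensor method computes its step from a model of the following standard form (a sketch of the general framework; the notation T_p, m_k, sigma_k is illustrative and not taken from the talk itself):

\[
  T_p(x_k, s) \;=\; f(x_k) + \sum_{j=1}^{p} \frac{1}{j!}\, \nabla^j f(x_k)[s]^j,
  \qquad
  m_k(s) \;=\; T_p(x_k, s) + \frac{\sigma_k}{p+1}\, \|s\|^{p+1},
\]

where sigma_k > 0 is the regularization parameter and the step s_k is a (suitably chosen) minimizer of m_k.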
The value of the regularization parameter in their analysis depends on the Lipschitz constant of the pth derivative. Since this constant is not usually known in advance, adaptive regularization methods are more practical. We extend the local convergence results to locally strongly convex functions and fully adaptive methods. 
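A common way to make such methods adaptive, shown here purely as a generic illustration of the mechanism rather than as the specific update analysed in the talk, is to compare the achieved decrease with the decrease predicted by the Taylor model, accepting the step on success and rejecting it and increasing sigma_k otherwise:

\[
  \rho_k = \frac{f(x_k) - f(x_k + s_k)}{f(x_k) - T_p(x_k, s_k)}, \qquad
  x_{k+1} =
  \begin{cases}
    x_k + s_k & \text{if } \rho_k \ge \eta,\\
    x_k & \text{otherwise,}
  \end{cases}
  \qquad
  \sigma_{k+1} =
  \begin{cases}
    \gamma_{\mathrm{dec}}\, \sigma_k & \text{if } \rho_k \ge \eta,\\
    \gamma_{\mathrm{inc}}\, \sigma_k & \text{otherwise,}
  \end{cases}
\]

with parameters \(\eta \in (0,1)\) and \(0 < \gamma_{\mathrm{dec}} \le 1 < \gamma_{\mathrm{inc}}\).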
We discuss how, for p > 2, it becomes crucial to select the "right" minimizer of the regularized local model in each iteration to ensure that all iterations are eventually successful. Counterexamples show that, in particular, the global minimizer of the subproblem is not suitable in general. If the right minimizer is used, the pth-order local convergence is preserved; otherwise the rate remains superlinear, but with an exponent that can be arbitrarily close to one, depending on the algorithm parameters.
