Tensor Methods for Nonconvex Optimization using Cubic-quartic regularization models
Abstract
High-order tensor methods for solving both convex and nonconvex optimization problems have recently generated significant research interest, due in part to the natural way in which higher derivatives can be incorporated into adaptive regularization frameworks, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, to find the next solution approximation, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information, and regularized by an appropriate power of the change in the iterates. Developing efficient techniques for the solution of such subproblems is an ongoing topic of research, and this talk addresses this question for the case of the third-order tensor subproblem.
In particular, we propose the CQR algorithmic framework for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation, by minimizing a sequence of local quadratic models that also incorporate both simple cubic and quartic terms. The role of the cubic term is to crudely approximate local tensor information, while the quartic one provides model regularization and controls progress. We provide necessary and sufficient optimality conditions that fully characterise the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. We show, using our optimality characterisations, that a CQR algorithmic variant has the optimal-order evaluation complexity of $O(\epsilon^{-3/2})$ when applied to minimizing our quartically-regularised cubic subproblem, which can be further improved in special cases. We propose practical CQR variants that judiciously use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances and even superior on ill-conditioned subproblems with special structure.
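For concreteness, the quartically-regularised cubic subproblem referred to above can be written, in commonly used notation (the symbols below are illustrative rather than taken from the talk), as
\[
\min_{s \in \mathbb{R}^{n}} \; m(s) \;=\; f + g^{T}s + \tfrac{1}{2}\, s^{T} H s + \tfrac{1}{6}\, T[s]^{3} + \tfrac{\sigma}{4}\, \|s\|^{4},
\]
where $g$, $H$ and $T$ denote first-, second- and third-order derivative information of the objective at the current iterate and $\sigma > 0$ is the regularisation weight; the CQR framework addresses the inner minimization of such models.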
Computing $H^2$-conforming finite element approximations without having to implement $C^1$-elements
Abstract
Fourth-order elliptic problems arise in a variety of applications from thin plates to phase separation to liquid crystals. A conforming Galerkin discretization requires a finite dimensional subspace of $H^2$, which in turn means that conforming finite element subspaces are $C^1$-continuous. In contrast to standard $H^1$-conforming $C^0$ elements, $C^1$ elements, particularly those of high order, are less understood from a theoretical perspective and are not implemented in many existing finite element codes. In this talk, we address this implementation difficulty. In particular, we present algorithms that compute $C^1$ finite element approximations to fourth-order elliptic problems and which only require elements with at most $C^0$-continuity. We also discuss solvers for the resulting subproblems and illustrate the method on a number of representative test problems.
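A representative model problem (assumed here for illustration; the talk covers fourth-order elliptic problems more generally) is the clamped biharmonic plate: find $u$ with
\[
\Delta^{2} u = f \ \text{ in } \Omega, \qquad u = \partial_{n} u = 0 \ \text{ on } \partial\Omega,
\]
whose conforming weak formulation seeks $u \in H^{2}_{0}(\Omega)$ such that $\int_{\Omega} \Delta u \, \Delta v \, dx = \int_{\Omega} f v \, dx$ for all $v \in H^{2}_{0}(\Omega)$, which is why a conforming Galerkin method needs a subspace of $H^{2}$.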
Fast High-Order Finite Element Solvers on Simplices
Abstract
We present new high-order finite elements discretizing the $L^2$ de Rham complex on triangular and tetrahedral meshes. The finite elements discretize the same spaces as usual, but with different basis functions. They allow for fast linear solvers based on static condensation and space decomposition methods.
The new elements build upon the definition of degrees of freedom given in (Demkowicz et al., De Rham diagram for $hp$ finite element spaces. Comput.~Math.~Appl., 39(7-8):29--38, 2000), and consist of integral moments on a symmetric reference simplex with respect to a numerically computed polynomial basis that is orthogonal in both the $L^2$- and $H(\mathrm{d})$-inner products ($\mathrm{d} \in \{\mathrm{grad}, \mathrm{curl}, \mathrm{div}\}$).
On the reference symmetric simplex, the resulting stiffness matrix has a diagonal interior block and does not couple the interior and interface degrees of freedom. Thus, on the reference simplex, the Schur complement resulting from the elimination of the interior degrees of freedom is simply the interface block itself.
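Schematically, writing $I$ for interior and $\Gamma$ for interface degrees of freedom (block notation assumed here for illustration), the reference-simplex stiffness matrix and its Schur complement are
\[
\hat{A} \;=\; \begin{pmatrix} \hat{A}_{II} & 0 \\ 0 & \hat{A}_{\Gamma\Gamma} \end{pmatrix},
\qquad
\hat{S} \;=\; \hat{A}_{\Gamma\Gamma} - \hat{A}_{\Gamma I}\,\hat{A}_{II}^{-1}\,\hat{A}_{I\Gamma} \;=\; \hat{A}_{\Gamma\Gamma},
\]
since $\hat{A}_{I\Gamma} = \hat{A}_{\Gamma I}^{T} = 0$ and $\hat{A}_{II}$ is diagonal.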
This sparsity is not preserved on arbitrary cells mapped from the reference cell. Nevertheless, the interior-interface coupling is weak because it is only induced by the geometric transformation. We devise a preconditioning strategy by neglecting the interior-interface coupling. We precondition the interface Schur complement with the interface block, and simply apply point-Jacobi to precondition the interior block.
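In the same (assumed) block notation, the preconditioner for a mapped cell simply drops the geometry-induced coupling:
\[
A \;=\; \begin{pmatrix} A_{II} & A_{I\Gamma} \\ A_{\Gamma I} & A_{\Gamma\Gamma} \end{pmatrix}
\;\approx\;
P \;=\; \begin{pmatrix} \operatorname{diag}(A_{II}) & 0 \\ 0 & A_{\Gamma\Gamma} \end{pmatrix},
\]
so the interface Schur complement is approximated by the interface block $A_{\Gamma\Gamma}$ and the interior block by its diagonal (point-Jacobi).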
The combination of this approach with a space decomposition method on small subdomains constructed around vertices, edges, and faces allows us to efficiently solve the canonical Riesz maps in $H^1$, $H(\mathrm{curl})$, and $H(\mathrm{div})$, at very high order. We empirically demonstrate iteration counts that are robust with respect to the polynomial degree.
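For reference (this standard formulation is not spelled out in the abstract), the Riesz-map problems in question take the variational form: find $u \in V$ such that
\[
(u, v)_{L^{2}} + (\mathrm{d}\, u, \mathrm{d}\, v)_{L^{2}} = F(v) \qquad \text{for all } v \in V,
\]
with $V \in \{H^{1}, H(\mathrm{curl}), H(\mathrm{div})\}$ and $\mathrm{d}$ the associated differential operator.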
Maths meets Stats
Abstract
Speaker: James Taylor
Title: D-Modules and p-adic Representations
Abstract: The representation theory of finite groups is a beautiful and well-understood subject. However, when one considers more complicated groups, things become more interesting, and classifying their representations is often a much harder problem. In this talk, I will introduce the classical theory, the particular groups I am interested in, and explain how one might hope to understand their representations through the use of D-modules - the algebraic incarnation of differential equations.
Speaker: Anthony Webster
Title: An Introduction to Epidemiology and Causal Inference
Abstract: This talk will introduce epidemiology and causal inference from the perspective of a statistician and former theoretical physicist. Despite their studies being underpinned by deep and often complex mathematics, epidemiologists are generally more concerned with seemingly mundane information about the relationships between potential risk factors and disease. Because of this, I will argue that a good epidemiologist with minimal statistical knowledge will often do better than a highly trained statistician. I will also argue that causal assumptions are a necessary part of epidemiology, should be made more explicit, and allow a much wider range of causal inferences to be explored. In the process, I will introduce ideas from epidemiology and causal inference such as Mendelian Randomisation and the "do calculus", methodological approaches that will increasingly underpin data-driven population research.
North meets South
Abstract
Speaker: Cedric Pilatte
Title: Convolution of integer sets: a galaxy of (mostly) open problems
Title: Manifold-Free Riemannian Optimization
Abstract: Optimization problems constrained to a smooth manifold can be solved via the framework of Riemannian optimization. To that end, a geometrical description of the constraining manifold, e.g., tangent spaces, retractions, and cost function gradients, is required. In this talk, we present a novel approach, based on a manifold learning technique, that enables approximate Riemannian optimization in cases where only a noiseless sample set of the cost function and the manifold’s intrinsic dimension are available.
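For orientation (generic notation, not taken from the talk), a basic Riemannian gradient step on an embedded manifold $\mathcal{M} \subset \mathbb{R}^{n}$ reads
\[
x_{k+1} \;=\; R_{x_{k}}\!\big(-\alpha_{k}\, \operatorname{grad} f(x_{k})\big),
\qquad
\operatorname{grad} f(x_{k}) \;=\; P_{T_{x_{k}}\mathcal{M}}\, \nabla f(x_{k}),
\]
where $P_{T_{x_{k}}\mathcal{M}}$ is the orthogonal projection onto the tangent space and $R_{x_{k}}$ is a retraction; in the manifold-free setting these objects are approximated from the sampled data rather than supplied in closed form.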