Forthcoming events in this series
Cubic-quartic regularization models for solving polynomial subproblems in third-order tensor methods
Abstract
High-order tensor methods for solving both convex and nonconvex optimization problems have recently generated significant research interest, due in part to the natural way in which higher derivatives can be incorporated into adaptive regularization frameworks, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, to find the next solution approximation, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information, and regularized by an appropriate power of the change in the iterates. Developing efficient techniques for the solution of such subproblems is currently, an ongoing topic of research, and this talk addresses this question for the case of the third-order tensor subproblem. In particular, we propose the CQR algorithmic framework, for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation, by sequentially minimizing a sequence of local quadratic models that also incorporate both simple cubic and quartic terms.
The role of the cubic term is to crudely approximate local tensor information, while the quartic one provides model regularization and controls progress. We provide necessary and sufficient optimality conditions that fully characterise the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. We show, using our optimality characterisations, that a CQR algorithmic variant has the optimal-order evaluation complexity of $O(\epsilon^{-3/2})$ when applied to minimizing our quartically-regularised cubic subproblem, which can be further improved in special cases. We propose practical CQR variants that judiciously use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances and even superior on ill-conditioned subproblems with special structure.
Reducing acquisition time and radiation damage: data-driven subsampling for spectromicroscopy
Abstract
Spectro-microscopy is an experimental technique with great potential to science challenges such as the observation of changes over time in energy materials or environmental samples and investigations of the chemical state in biological samples. However, its application is often limited by factors like long acquisition times and radiation damage. We present two measurement strategies that significantly reduce experiment times and applied radiation doses. These strategies involve acquiring only a small subset of all possible measurements and then completing the full data matrix from the sampled measurements. The methods are data-driven, utilizing spectral and spatial importance subsampling distributions to select the most informative measurements. Specifically, we use data-driven leverage scores and adaptive randomized pivoting techniques. We explore raster importance sampling combined with the LoopASD completion algorithm, as well as CUR-based sampling where the CUR approximation also serves as the completion method. Additionally, we propose ideas to make the CUR-based approach adaptive. As a result, capturing as little as 4–6% of the measurements is sufficient to recover the same information as a conventional full scan.
Low-rank approximation of parameter-dependent matrices via CUR decomposition
Abstract
Low-rank approximation of parameter-dependent matrices A(t) is an important task in the computational sciences, with applications in areas such as dynamical systems and the compression of series of images. In this talk, we introduce AdaCUR, an efficient randomised algorithm for computing low-rank approximations of parameter-dependent matrices using the CUR decomposition. The key idea of our approach is the ability to reuse column and row indices for nearby parameter values, improving efficiency. The resulting algorithm is rank-adaptive, provides error control, and has complexity that compares favourably with existing methods. This is joint work with Yuji Nakatsukasa.
Control of multistable structures with shape optimization
Abstract
Shape optimization is a rich field at the intersection of analysis, optimization, and engineering. It seeks to determine the optimal geometry of structures to minimize performance objectives, subject to physical constraints—often modeled by Partial Differential Equations (PDEs). Traditional approaches commonly assume that these constraints admit a unique solution for each candidate shape, implying a single-valued shape-to-solution map. However, many real-world structures exhibit multistability, where multiple stable configurations exist under identical physical conditions.
This research departs from the single-solution paradigm by investigating shape optimization in the presence of multiple solutions to the same PDE constraints. Focusing on a neo-Hookean hyperelastic model, we formulate an optimization problem aimed at controlling the energy jump between distinct solutions. Drawing on bifurcation theory, we develop a theoretical framework that interprets these solutions as continuous branches parameterized by shape variations. Building on this foundation, we implement a numerical optimization strategy and present numerical results that demonstrate the effectiveness of our approach.
Fast solvers for high-order finite element discretizations of the de Rham complex
Abstract
Many applications in electromagnetism, magnetohydrodynamics, and pour media flow are well-posed in spaces from the 3D de Rham complex involving $H^1$, $H(curl)$, $H(div)$, and $L^2$. Discretizing these spaces with the usual conforming finite element spaces typically leads to discrete problems that are both structure-preserving and uniformly stable with respect to the mesh size and polynomial degree. Robust preconditioners/solvers usually require the inversion of subproblems or auxiliary problems on vertex, edge, or face patches of elements. For high-order discretizations, the cost of inverting these patch problems scales like $\mathcal{O}(p^9)$ and is thus prohibitively expensive. We propose a new set of basis functions for each of the spaces in the discrete de Rham complex that reduce the cost of the patch problems to $\mathcal{O}(p^6)$ complexity. By taking advantage of additional properties of the new basis, we propose further computationally cheaper variants of existing preconditioners. Various numerical examples demonstrate the performance of the solvers.
Computing complex resonances with AAA
Abstract
A beautiful example of a nonlinear eigenvalue problem is the determination of complex eigenvalues for wave scattering. This talk will show how nicely this can be done by applying AAA rational approximation to a scalarized resolvent sampled at a few real frequencies. Even for a domain as elementary as a circle with a gap in it, such computations do not seem to have been done before. This is joint work with Oscar Bruno and Manuel Santana at Caltech.
High-order finite element methods for multicomponent convection-diffusion
Abstract
Multicomponent fluids are mixtures of distinct chemical species (i.e. components) that interact through complex physical processes such as cross-diffusion and chemical reactions. Additional physical phenomena often must be accounted for when modelling these fluids; examples include momentum transport, thermality and (for charged species) electrical effects. Despite the ubiquity of chemical mixtures in nature and engineering, multicomponent fluids have received almost no attention from the finite element community, with many important applications remaining out of reach from numerical methods currently available in the literature. This is in spite of the fact that, in engineering applications, these fluids often reside in complicated spatial regions -- a situation where finite elements are extremely useful! In this talk, we present a novel class of high-order finite element methods for simulating cross-diffusion and momentum transport (i.e. convection) in multicomponent fluids. Our model can also incorporate local electroneutrality when the species carry electrical charge, making the numerical methods particularly desirable for simulating liquid electrolytes in electrochemical applications. We discuss challenges that arise when discretising the partial differential equations of multicomponent flow, as well as some salient theoretical properties of our numerical schemes. Finally, we present numerical simulations involving (i) the microfluidic non-ideal mixing of hydrocarbons and (ii) the transient evolution of a lithium-ion battery electrolyte in a Hull cell electrode.
FUSE: the finite element as data
Abstract
The Ciarlet definition of a finite element has been core to our understanding of the finite element method since its inception. It has proved particularly useful in structuring the implementation of finite element software. However, the definition does not encapsulate all the details required to uniquely implement an element, meaning each user of the definition (whether a researcher or software package) must make further mathematical assumptions to produce a working system.
The talk presents a new definition built on Ciarlet’s that addresses these concerns. The novel definition forms the core of a new piece of software in development, FUSE, which allows the users to consider the choice of finite element as part of the data they are working with. This is a new implementation strategy among finite element software packages, and we will discuss some potential benefits of the development.
How to warm-start your unfolding network
Abstract
We present a new ensemble framework for boosting the performance of overparameterized unfolding networks solving the compressed sensing problem. We combine a state-of-the-art overparameterized unfolding network with a continuation technique, to warm-start a crucial quantity of the said network's architecture; we coin the resulting continued network C-DEC. Moreover, for training and evaluating C-DEC, we incorporate the log-cosh loss function, which enjoys both linear and quadratic behavior. Finally, we numerically assess C-DEC's performance on real-world images. Results showcase that the combination of continuation with the overparameterized unfolded architecture, trained and evaluated with the chosen loss function, yields smoother loss landscapes and improved reconstruction and generalization performance of C-DEC, consistently for all datasets.
Full waveform inversion using higher-order finite elements
Abstract
Inversion problems, such as full waveform inversion (FWI), based on wave propagation, are computationally costly optimization processes used in many applications, ranging from seismic imaging to brain tomography. In most of these uses, high-order methods are required for both accuracy and computational efficiency. Within finite element methods (FEM), using high(er)-order can provide accuracy and the usage of flexible meshes. However, FEM are rarely employed in connection with unstructured simplicial meshes because of the computational cost and complexity of code implementation. They are used frequently with quadrilateral or hexahedral spectral finite elements, but the mesh adaptivity on those elements has not yet been fully explored. In this work, we address these challenges by developing software that leverages accurate higher-order mass-lumped simplicial elements with a mesh-adaption parameter, allowing us to take advantage of the computational efficiency of newer mass-lumped simplicial elements together with waveform-adapted meshes and the accuracy of higher-order function spaces. We also calculate these mesh-related parameters and develop software for high-order spectral element methods, allowing mesh flexibility. We will also discuss future developments. The open-source code was implemented using the Firedrake framework and the Unified Form Language (UFL), a mathematical-based domain specific language, allowing flexibility in a wide range of wave-based problems.
Unfiltered and Filtered Low-Regularity Approaches for Nonlinear Dispersive PDEs
Abstract
In this talk, I will present low-regularity numerical methods for nonlinear dispersive PDEs, with unfiltered schemes analyzed in Sobolev spaces and filtered schemes in discrete Bourgain spaces, offering effective handling of low-regularity and even rough solutions. I will highlight the significance of exploring structure-preserving low-regularity schemes, as this is a crucial area for further research.
High-order and sparsity-promoting Stokes elements
Abstract
A posteriori error estimation for randomized low-rank approximation
Abstract
A number of algorithms are now available---including Halko-Martinsson-Tropp, interpolative decomposition, CUR, generalized Nystrom, and QR with column pivoting---for computing a low-rank approximation of matrices. Some methods come with extremely strong guarantees, while others may fail with nonnegligible probability. We present methods for efficiently estimating the error of the approximation for a specific instantiation of the methods. Such certificate allows us to execute "responsibly reckless" algorithms, wherein one tries a fast, but potentially unstable, algorithm, to obtain a potential solution; the quality of the solution is then assessed in a reliable fashion, and remedied if necessary. This is joint work with Gunnar Martinsson.
Time permitting, I will ramble about other topics in Randomised NLA.
On Objective-Free High Order Methods
Abstract
An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in
which the objective function is never evaluated, but only derivatives are used and without prior knowledge of Lipschitz constant. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in this paper that these excellent complexity bounds are also valid for the new algorithm. Theoretical analysis of both exact and stochastic cases are discussed and new probabilistic conditions on tensor derivatives are proposed. Initial experiments on large binary classification highlight the merits of our method.
Efficient Adaptive Regularized Tensor Methods
Abstract
High-order tensor methods employing local Taylor approximations have attracted considerable attention for convex and nonconvex optimisation. The pth-order adaptive regularisation (ARp) approach builds a local model comprising a pth-order Taylor expansion and a (p+1)th-order regularisation term, delivering optimal worst-case global and local convergence rates. However, for p≥2, subproblem minimisation can yield multiple local minima, and while a global minimiser is recommended for p=2, effectively identifying a suitable local minimum for p≥3 remains elusive.
This work extends interpolation-based updating strategies, originally proposed for p=2, to cases where p≥3, allowing the regularisation parameter to adapt in response to interpolation models. Additionally, it introduces a new prerejection mechanism to discard unfavourable subproblem minimisers before function evaluations, thus reducing computational costs for p≥3.
Numerical experiments, particularly on Chebyshev-Rosenbrock problems with p=3, indicate that the proper use of different minimisers can significantly improve practical performance, offering a promising direction for designing more efficient high-order methods.
Who needs a residual when an approximation will do?
Abstract
The widespread need to solve large-scale linear systems has sparked a growing interest in randomized techniques. One such class of techniques is known as iterative random sketching methods (e.g., Randomized Block Kaczmarz and Randomized Block Coordinate Descent). These methods "sketch" the linear system to generate iterative, easy-to-compute updates to a solution. By working with sketches, these methods can often enable more efficient memory operations, potentially leading to faster performance for large-scale problems. Unfortunately, tracking the progress of these methods still requires computing the full residual of the linear system, an operation that undermines the benefits of the solvers. In practice, this cost is mitigated by occasionally computing the full residual, typically after an epoch. However, this approach sacrifices real-time progress tracking, resulting in wasted computations. In this talk, we use statistical techniques to develop a progress estimation procedure that provides inexpensive, accurate real-time progress estimates at the cost of a small amount of uncertainty that we effectively control.
Preconditioners for Multicomponent Flows
Abstract
Multicomponent flows, i.e. mixtures, can be modeled effectively using the Onsager-Stefan-Maxwell (OSM) equations. The OSM equations can account for a wide variety of phenomena such as diffusive, convective, non-ideal mixing, thermal, pressure and electrochemical effects for steady and transient multicomponent flows. I will first introduce the general OSM framework and a finite element discretisation for multicomponent diffusion of ideal gasses. Then I will show two ways of preconditioning the multicomponent diffusion problem for various boundary conditions. Time permitting, I will also discuss how this can be extended to the non-ideal, thermal, and nonisobaric settings.
Local convergence of adaptively regularized tensor methods
Abstract
Tensor methods are methods for unconstrained continuous optimization that can incorporate derivative information of up to order p > 2 by computing a step based on the pth-order Taylor expansion at each iteration. The most important among them are regularization-based tensor methods which have been shown to have optimal worst-case iteration complexity of finding an approximate minimizer. Moreover, as one might expect, this worst-case complexity improves as p increases, highlighting the potential advantage of tensor methods. Still, the global complexity results only guarantee pessimistic sublinear rates, so it is natural to ask how local rates depend on the order of the Taylor expansion p. In the case of strongly convex functions and a fixed regularization parameter, the answer is given in a paper by Doikov and Nesterov from 2022: we get pth-order local convergence of function values and gradient norms.
The value of the regularization parameter in their analysis depends on the Lipschitz constant of the pth derivative. Since this constant is not usually known in advance, adaptive regularization methods are more practical. We extend the local convergence results to locally strongly convex functions and fully adaptive methods.
We discuss how for p > 2 it becomes crucial to select the "right" minimizer of the regularized local model in each iteration to ensure all iterations are eventually successful. Counterexamples show that in particular the global minimizer of the subproblem is not suitable in general. If the right minimizer is used, the pth-order local convergence is preserved, otherwise the rate stays superlinear but with an exponent arbitrarily close to one depending on the algorithm parameters.
Structure-preserving discretisation for magneto-frictional equations in the Parker conjecture
Abstract
The Parker conjecture, which explores whether magnetic fields in perfectly conducting plasmas can develop tangential discontinuities during magnetic relaxation, remains an open question in astrophysics. Helicity conservation provides a topological barrier against topologically nontrivial initial data relaxing to a trivial solution. Preserving this mechanism is therefore crucial for numerical simulation.
This paper presents an energy- and helicity-preserving finite element discretization for the magneto-frictional system for investigating the Parker conjecture. The algorithm enjoys a discrete version of the topological mechanism and a discrete Arnold inequality.
We will also discuss extensions to domains with nontrivial topology.
This is joint work with Prof Patrick Farrell, Dr Kaibo Hu, and Boris Andrews
Efficient SAA Methods for Hyperparameter Estimation in Bayesian Inverse Problems
Abstract
In Bayesian inverse problems, it is common to consider several hyperparameters that define the prior and the noise model that must be estimated from the data. In particular, we are interested in linear inverse problems with additive Gaussian noise and Gaussian priors defined using Matern covariance models. In this case, we estimate the hyperparameters using the maximum a posteriori (MAP) estimate of the marginalized posterior distribution.
However, this is a computationally intensive task since it involves computing log determinants. To address this challenge, we consider a stochastic average approximation (SAA) of the objective function and use the preconditioned Lanczos method to compute efficient function evaluation approximations.
We can therefore compute the MAP estimate of the hyperparameters efficiently by building a preconditioner which can be updated cheaply for new values of the hyperparameters; and by leveraging numerical linear algebra tools to reuse information efficiently for computing approximations of the gradient evaluations. We demonstrate the performance of our approach on inverse problems from tomography.
Distributional Complexes in two and three dimensions
Abstract
In recent years, some progress has been made in the development of finite element complexes, particularly in the discretization of BGG complexes in two and three dimensions, including Hessian complexes, elasticity complexes, and divdiv complexes. In this talk, I will discuss distributional complexes in two and three dimensions. These complexes are simply constructed using geometric concepts such as vertices, edges, and faces, and they share the same cohomology as the complexes at the continuous level, which reflects that the discretization is structure preserving. The results can be regarded as a tensor generalization of the Whitney forms of the finite element exterior calculus. This talk is based on joint work with Snorre Christiansen (Oslo), Kaibo Hu (Edinburgh), and Qian Zhang (Michigan).
Multirevolution integrators for stochastic multiscale dynamics with fast stochastic oscillations
Abstract
We introduce a new methodology based on the multirevolution idea for constructing integrators for stochastic differential equations in the situation where the fast oscillations themselves are driven by a Stratonovich noise. Applications include in particular highly-oscillatory Kubo oscillators and spatial discretizations of the nonlinear Schrödinger equation with fast white noise dispersion. We construct a method of weak order two with computational cost and accuracy both independent of the stiffness of the oscillations. A geometric modification that conserves exactly quadratic invariants is also presented. If time allows, we will discuss ongoing work on uniformly accurate methods for such systems. This is a joint work with Gilles Vilmart.
Backward error for nonlinear eigenvalue problems
Abstract
The backward error analysis is an important part of the perturbation theory and it is particularly useful for the study of the reliability of the numerical methods. We focus on the backward error for nonlinear eigenvalue problems. In this talk, the matrix-valued function is given as a linear combination of scalar functions multiplying matrix coefficients, and the perturbation is done on the coefficients. We provide theoretical results about the backward error of a set of approximate eigenpairs. Indeed, small backward errors for separate eigenpairs do not imply small backward errors for a set of approximate eigenpairs. In this talk, we provide inexpensive upper bounds, and a way to accurately compute the backward error by means of direct computations or through Riemannian optimization. We also discuss how the backward error can be determined when the matrix coefficients of the matrix-valued function have particular structures (such as symmetry, sparsity, or low-rank), and the perturbations are required to preserve them. For special cases (such as for symmetric coefficients), explicit and inexpensive formulas to compute the perturbed matrix coefficients are also given. This is a joint work with Leonardo Robol (University of Pisa).
Structure-preserving low-regularity integrators for dispersive nonlinear equations
Abstract
Dispersive nonlinear partial differential equations can be used to describe a range of physical systems, from water waves to spin states in ferromagnetism. The numerical approximation of solutions with limited differentiability (low-regularity) is crucial for simulating fascinating phenomena arising in these systems including emerging structures in random wave fields and dynamics of domain wall states, but it poses a significant challenge to classical algorithms. Recent years have seen the development of tailored low-regularity integrators to address this challenge. Inherited from their description of physicals systems many such dispersive nonlinear equations possess a rich geometric structure, such as a Hamiltonian formulation and conservation laws. To ensure that numerical schemes lead to meaningful results, it is vital to preserve this structure in numerical approximations. This, however, results in an interesting dichotomy: the rich theory of existent structure-preserving algorithms is typically limited to classical integrators that cannot reliably treat low-regularity phenomena, while most prior designs of low-regularity integrators break geometric structure in the equation. In this talk, we will outline recent advances incorporating structure-preserving properties into low-regularity integrators. Starting from simple discussions on the nonlinear Schrödinger and the Korteweg–de Vries equation we will discuss the construction of such schemes for a general class of dispersive equations before demonstrating an application to the simulation of low-regularity vortex filaments. This is joint work with Yvonne Alama Bronsard, Valeria Banica, Yvain Bruned and Katharina Schratz.