Thu, 18 Jan 2024

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

A preconditioner with low-rank corrections based on the Bregman divergence

Andreas Bock
(Technical University of Denmark)
Abstract

We present a general framework for preconditioning Hermitian positive definite linear systems based on the Bregman log determinant divergence. This divergence provides a measure of discrepancy between a preconditioner and a target matrix, giving rise to the study of preconditioners given as the sum of a Hermitian positive definite matrix plus a low-rank correction. We describe under which conditions the preconditioner minimises the $\ell^2$ condition number of the preconditioned matrix, and obtain the low-rank correction via a truncated singular value decomposition (TSVD). Numerical results from variational data assimilation (4D-VAR) support our theoretical results.
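As a rough illustration of the objects in play (not the authors' implementation), the following NumPy sketch evaluates the Bregman log-determinant divergence $D(X,Y) = \mathrm{tr}(XY^{-1}) - \log\det(XY^{-1}) - n$ between Hermitian positive definite matrices and assembles a preconditioner of the form "HPD matrix plus rank-$k$ TSVD correction" for a toy target with low-rank structure; the matrices, sizes and rank are arbitrary choices made for the example.

```python
# Hedged sketch (not the authors' code): the Bregman log-determinant divergence
# and a preconditioner built as "HPD matrix + low-rank TSVD correction".
import numpy as np

def bregman_logdet(X, Y):
    """D(X, Y) = tr(X Y^{-1}) - logdet(X Y^{-1}) - n for HPD X, Y."""
    n = X.shape[0]
    W = np.linalg.solve(Y, X)              # Y^{-1} X (same trace/det as X Y^{-1})
    sign, logdet = np.linalg.slogdet(W)
    return np.trace(W) - logdet - n

rng = np.random.default_rng(0)
n, r, k = 200, 10, 5

# Toy target: HPD base matrix plus a low-rank term (loosely 4D-Var-like structure).
Q = np.diag(np.linspace(1.0, 5.0, n))      # cheap HPD "base" preconditioner
U = rng.standard_normal((n, r))
A = Q + U @ U.T

# Rank-k TSVD correction of the error E = A - Q (here symmetric, so eigh = SVD).
E = A - Q
w, V = np.linalg.eigh(E)
idx = np.argsort(np.abs(w))[::-1][:k]      # keep the k largest singular values
P = Q + (V[:, idx] * w[idx]) @ V[:, idx].T # preconditioner: HPD + rank-k correction

for name, M in [("Q alone", Q), ("Q + rank-k TSVD", P)]:
    cond = np.linalg.cond(np.linalg.solve(M, A))
    print(f"{name:18s} cond(inv(M)*A) = {cond:8.2f}   Bregman D(M, A) = {bregman_logdet(M, A):8.2f}")
```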


We also apply the framework to approximate factorisation preconditioners with a low-rank correction (e.g. incomplete Cholesky plus low-rank). In such cases, the approximate factorisation error is typically indefinite, and the low-rank correction described by the Bregman divergence is generally different from one obtained as a TSVD. We compare these two truncations in terms of convergence of the preconditioned conjugate gradient method (PCG), and show numerous examples where PCG converges to a small tolerance using the proposed preconditioner, whereas PCG using a TSVD-based preconditioner fails. We also consider matrices arising from interior point methods for linear programming that do not admit such an incomplete factorisation by default, and present a robust incomplete Cholesky preconditioner based on the proposed methodology.

The talk is based on papers with Martin S. Andersen (DTU).


Thu, 02 May 2024

14:00 - 15:00
Lecture Room 3

Mathematics: key enabling technology for scientific machine learning

Wil Schilders
(TU Eindhoven)
Abstract

Artificial Intelligence (AI) will strongly determine our future prosperity and well-being. Due to its generic nature, AI will have an impact on all sciences and business sectors, our private lives and society as a whole. AI is pre-eminently a multidisciplinary technology that connects scientists from a wide variety of research areas, from behavioural science and ethics to mathematics and computer science.

Without downplaying the importance of that variety, it is apparent that mathematics can and should play an active role, all the more so because, alongside the successes of AI, critical voices are increasingly being heard. As Robert Dijkgraaf (former director of the Institute for Advanced Study in Princeton) observed in May 2019: "Artificial intelligence is in its adolescent phase, characterised by trial and error, self-aggrandisement, credulity and lack of systematic understanding." Mathematics can contribute to the much-needed systematic understanding of AI, for example by improving the reliability and robustness of AI algorithms, by helping us understand the operation and sensitivity of networks, by reducing the need for abundant data sets, and by incorporating physical properties into the neural networks needed for super-fast and accurate simulations in the context of digital twinning.

Mathematicians absolutely recognize the potential of artificial intelligence, machine learning and (deep) neural networks for future developments in science, technology and industry. At the same time, a sound mathematical treatment is essential for all aspects of artificial intelligence, from imaging and speech recognition to text analysis and autonomous driving, which makes it essential to involve mathematicians in all these areas. In this presentation, we highlight the role of mathematics as a key enabling technology within the emerging field of scientific machine learning. Or, as I always say: "Real intelligence is needed to make artificial intelligence work."


Thu, 25 Apr 2024

14:00 - 15:00
Lecture Room 3

ESPIRA: Estimation of Signal Parameters via Iterative Rational Approximation

Nadiia Derevianko
(University of Göttingen)
Abstract

We introduce a new method, ESPIRA (Estimation of Signal Parameters via Iterative Rational Approximation) [1, 2], for the recovery of complex exponential sums
$$
f(t)=\sum_{j=1}^{M} \gamma_j \mathrm{e}^{\lambda_j t},
$$
that are determined by a finite number of parameters: the order $M$, the weights $\gamma_j \in \mathbb{C} \setminus \{0\}$ and the nodes $\mathrm{e}^{\lambda_j} \in \mathbb{C}$ for $j=1,\ldots,M$. Our new recovery procedure is based on the observation that the Fourier (or DFT) coefficients of exponential sums have a special rational structure. To reconstruct this structure in a stable way we use the AAA algorithm proposed by Nakatsukasa et al. [3]. We show that ESPIRA can be interpreted as a matrix pencil method applied to Loewner matrices.
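To make the rational structure concrete, here is a small hedged NumPy check (not the authors' code): for samples $f(0),\ldots,f(N-1)$ of an exponential sum, the DFT coefficients satisfy $F_k = \sum_j \gamma_j (1 - z_j^N)/(1 - z_j w_k)$ with $z_j = \mathrm{e}^{\lambda_j}$ and $w_k = \mathrm{e}^{-2\pi i k/N}$, i.e. they are values of a rational function. The nodes and weights below are arbitrary toy data, and the AAA fitting step itself is not shown.

```python
# Hedged sketch: numerically checking the rational structure of DFT coefficients
# of an exponential sum (the observation underlying ESPIRA; not the authors' code).
import numpy as np

rng = np.random.default_rng(1)
N, M = 64, 3
lam = -0.05 + 2j * np.pi * rng.uniform(-0.3, 0.3, M)        # toy nodes lambda_j
gam = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # toy weights gamma_j

t = np.arange(N)
f = (gam[None, :] * np.exp(np.outer(t, lam))).sum(axis=1)   # samples f(0), ..., f(N-1)
F = np.fft.fft(f)                                           # DFT coefficients

# Closed form: F_k = sum_j gamma_j (1 - z_j^N) / (1 - z_j * w_k),  z_j = e^{lambda_j},
# w_k = e^{-2*pi*i*k/N}; i.e. the DFT values lie on a rational function of w_k.
z = np.exp(lam)
w = np.exp(-2j * np.pi * np.arange(N) / N)
F_rational = ((gam * (1 - z**N))[None, :] / (1 - np.outer(w, z))).sum(axis=1)

print("max |FFT - rational formula| =", np.abs(F - F_rational).max())
# A rational approximation method such as AAA can recover this structure from F,
# and the recovered poles 1/z_j then give back the nodes e^{lambda_j}.
```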

During the talk we will demonstrate that ESPIRA outperforms Prony-like methods such as ESPRIT and the matrix pencil method (MPM) for noisy data and for signal approximation by short exponential sums.


Bibliography

[1] N. Derevianko, G. Plonka. Exact reconstruction of extended exponential sums using rational approximation of their Fourier coefficients. Anal. Appl., 20(3), 2022, 543-577.

[2] N. Derevianko, G. Plonka, M. Petz. From ESPRIT to ESPIRA: Estimation of signal parameters by iterative rational approximation. IMA J. Numer. Anal., 43(2), 2023, 789-827.

[3] Y. Nakatsukasa, O. Sète, L.N. Trefethen. The AAA algorithm for rational approximation. SIAM J. Sci. Comput., 40(3), 2018, A1494-A1522.

Thu, 01 Feb 2024
14:00
Lecture Room 3

A strongly polynomial algorithm for the minimum-cost generalized flow problem

Laszlo Vegh
(LSE)
Abstract

We give a strongly polynomial algorithm for minimum-cost generalized flow and, as a consequence, for all linear programs with at most two nonzero entries per row, or at most two nonzero entries per column. While strongly polynomial algorithms for the primal and dual feasibility problems have been known for a long time, various combinatorial approaches used for those problems did not seem to carry over to the minimum-cost variant.
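To fix ideas about the problem class (this is not the algorithm from the talk), here is a hedged toy example: a three-arc minimum-cost generalized flow instance, in which every arc variable enters at most two node-balance rows, solved with SciPy's generic linprog. The network data, demand and supply are invented for illustration.

```python
# Hedged toy example of the problem class (not the algorithm from the talk):
# a minimum-cost generalized flow LP, where each arc has a gain factor and each
# arc variable appears in at most two node-balance rows. Solved here with a
# generic LP solver, which is of course not strongly polynomial.
import numpy as np
from scipy.optimize import linprog

# Arcs: (tail, head, gain, cost, capacity).  Toy data, chosen arbitrarily.
arcs = [("s", "a", 1.0, 1.0, 10.0),
        ("s", "t", 0.5, 3.0, 10.0),
        ("a", "t", 2.0, 1.0, 10.0)]
demand_at_t = 8.0
supply_at_s = 10.0

cost = np.array([a[3] for a in arcs])
caps = [(0.0, a[4]) for a in arcs]

# Flow conservation with gains at the internal node "a" and the sink "t":
# (gain * inflow) - outflow = 0 at "a",  (gain * inflow) = demand at "t".
A_eq = np.array([[1.0, 0.0, -1.0],    # node a:  1.0*x_sa - x_at = 0
                 [0.0, 0.5,  2.0]])   # node t:  0.5*x_st + 2.0*x_at = demand
b_eq = np.array([0.0, demand_at_t])

# Outflow bound at the source "s".
A_ub = np.array([[1.0, 1.0, 0.0]])
b_ub = np.array([supply_at_s])

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=caps)
print("optimal arc flows:", dict(zip([f"{a[0]}->{a[1]}" for a in arcs], res.x.round(3))))
print("minimum cost     :", res.fun)
```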

Our approach is to show that the ‘subspace layered least squares’ interior point method, from earlier joint work with Allamigeon, Dadush, Loho and Natura, requires only a strongly polynomial number of iterations for minimum-cost generalized flow. We achieve this by bounding the straight-line complexity introduced in the same paper. The talk will give an overview of the interior point method as well as the combinatorial straight-line complexity analysis for this particular setting. This is joint work with Daniel Dadush, Zhuan Khye Koh, Bento Natura, and Neil Olver.

Thu, 26 Oct 2023
14:00
Lecture Room 3

Algebraic domain-decomposition preconditioners for the solution of linear systems

Tyrone Rees
(Rutherford Appleton Laboratory)
Abstract

The need to solve linear systems of equations is ubiquitous in scientific computing. Powerful methods for preconditioning such systems have been developed in cases where we can exploit knowledge of the origin of the linear system; a recent example, for systems arising from PDEs, is the GenEO domain decomposition method, which works well but requires a non-trivial amount of knowledge of the underlying problem to implement.

In this talk I will present a new spectral coarse space that can be constructed in a fully algebraic way, in contrast to most existing spectral coarse spaces, and give a theoretical convergence result for Hermitian positive definite diagonally dominant matrices. Numerical experiments and comparisons against state-of-the-art preconditioners from the multigrid community show that the resulting two-level Schwarz preconditioner is efficient, especially for non-self-adjoint operators, and in that case it outperforms these state-of-the-art alternatives.
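For orientation, the sketch below sets up a generic two-level Schwarz-type preconditioner (a coarse solve plus independent local solves) for a 1D Laplacian and runs preconditioned CG with it. The non-overlapping index blocks and the piecewise-constant coarse space are placeholder choices; the algebraically constructed spectral coarse space from the talk is not reproduced here.

```python
# Hedged sketch of a generic two-level preconditioner
#   M^{-1} = R_0^T A_0^{-1} R_0 + sum_i R_i^T A_i^{-1} R_i
# on a 1D Laplacian, with non-overlapping blocks and a piecewise-constant
# coarse space (placeholders, not the spectral coarse space from the talk).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, nblocks = 400, 8
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
blocks = np.array_split(np.arange(n), nblocks)

# Local solvers: LU factorizations of the diagonal blocks A_i = R_i A R_i^T.
local_lu = [spla.splu(A[b, :][:, b].tocsc()) for b in blocks]

# Coarse space: one piecewise-constant basis vector per block, A_0 = R_0 A R_0^T.
rows = np.concatenate([np.full(len(b), i) for i, b in enumerate(blocks)])
R0 = sp.csr_matrix((np.ones(n), (rows, np.concatenate(blocks))), shape=(nblocks, n))
A0_lu = spla.splu((R0 @ A @ R0.T).tocsc())

def apply_M(r):
    """Apply M^{-1}: coarse correction plus additive local corrections."""
    z = R0.T @ A0_lu.solve(R0 @ r)
    for b, lu in zip(blocks, local_lu):
        z[b] += lu.solve(r[b])
    return z

M = spla.LinearOperator((n, n), matvec=apply_M)
b_rhs = np.ones(n)
its = [0]
x, info = spla.cg(A, b_rhs, M=M, rtol=1e-10,
                  callback=lambda xk: its.__setitem__(0, its[0] + 1))
print("CG with two-level preconditioner: info =", info, ", iterations =", its[0])
```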

This is joint work with Hussam Al Daas, Pierre Jolivet and Jennifer Scott.

Thu, 07 Mar 2024

14:00 - 15:00
Lecture Room 3

Stabilized Lagrange-Galerkin schemes for viscous and viscoelastic flow problems

Hirofumi Notsu
(Kanazawa University)
Abstract

Many researchers are developing stable and accurate numerical methods for flow problems, which roughly fall into two classes: upwind methods and characteristics(-based) methods.
The Lagrange-Galerkin method proposed and analyzed in, e.g., [O. Pironneau, Numer. Math., 1982] and [E. Süli, Numer. Math., 1988] is the finite element method combined with the idea of the method of characteristics; hence, it belongs to the characteristics(-based) methods. Its advantages are CFL-free robustness for convection-dominated problems and the symmetry of the resulting coefficient matrix. In this talk, we introduce stabilized Lagrange-Galerkin schemes of second order in time for viscous and viscoelastic flow problems, which employ the cheapest conforming P1 element for all the unknown functions, i.e., velocity, pressure, and conformation tensor, with the help of pressure stabilization [F. Brezzi and J. Pitkäranta, Vieweg+Teubner, 1984], reducing the number of DOFs.
Focusing on recent developments in the discretization of the (non-conservative and conservative) material derivatives and the upper-convected time derivative, we present theoretical and numerical results.
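As a hedged illustration of the characteristics idea only (not the schemes from the talk), the snippet below performs semi-Lagrangian transport in 1D: each grid point is traced back along the characteristic and the solution is interpolated there, which is what makes the approach CFL-free. Plain interpolation stands in for the Galerkin projection, and the velocity, grid and time step are toy choices.

```python
# Hedged 1D sketch of the characteristics idea behind Lagrange-Galerkin schemes:
# the material derivative is approximated along the characteristic,
#   (u^{n+1}(x) - u^n(X(x))) / dt,   X(x) = x - a * dt   (foot of the characteristic),
# which imposes no CFL restriction. A genuine Lagrange-Galerkin method would
# combine this with a finite element projection; interpolation stands in for it.
import numpy as np

n, a, T = 200, 1.0, 0.5
x = np.linspace(0.0, 1.0, n, endpoint=False)
dx = x[1] - x[0]
dt = 4.0 * dx / a                    # deliberately larger than the explicit CFL limit
u = np.exp(-200 * (x - 0.3) ** 2)    # initial profile

for _ in range(int(T / dt)):
    feet = (x - a * dt) % 1.0        # characteristics traced back, periodic domain
    u = np.interp(feet, x, u, period=1.0)

print("max value after transport:", u.max())   # stays bounded despite dt > dx/a
```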

Thu, 19 Oct 2023

14:00 - 15:00
Lecture Room 3

Randomized Least Squares Optimization and its Incredible Utility for Large-Scale Tensor Decomposition

Tammy Kolda
(mathsci.ai)
Abstract

Randomized least squares is a promising method but not yet widely used in practice. We show an example of its use for finding low-rank canonical polyadic (CP) tensor decompositions for large sparse tensors. This involves solving a sequence of overdetermined least squares problems with special (Khatri-Rao product) structure.

In this work, we present an application of randomized algorithms to fitting the CP decomposition of sparse tensors, solving a significantly smaller sampled least squares problem at each iteration with probabilistic guarantees on the approximation errors. We perform sketching through leverage score sampling, crucially relying on the fact that the problem structure enables efficient sampling from overestimates of the leverage scores with much less work. We discuss what it took to make the algorithm practical, including general-purpose improvements.
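A hedged sketch of the sampling step on a small dense problem (not the authors' code): rows of an overdetermined least squares problem are sampled with probabilities proportional to their leverage scores, rescaled, and the sketched solution is compared with the full one. Here the leverage scores are computed exactly via a thin QR; in the CP setting of the talk, the Khatri-Rao structure is exploited to sample from cheap overestimates instead.

```python
# Hedged sketch of leverage-score-sampled least squares (not the authors' code).
import numpy as np

rng = np.random.default_rng(3)
m, n, s = 20000, 50, 1000            # tall problem, sketch size s << m
A = rng.standard_normal((m, n)) @ np.diag(np.logspace(0, 3, n))  # ill-conditioned toy matrix
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Leverage scores: squared row norms of the orthogonal factor Q.
Q, _ = np.linalg.qr(A, mode="reduced")
lev = np.einsum("ij,ij->i", Q, Q)
p = lev / lev.sum()

# Sample s rows with probability p_i and rescale by 1/sqrt(s * p_i).
idx = rng.choice(m, size=s, p=p)
scale = 1.0 / np.sqrt(s * p[idx])
x_sketch, *_ = np.linalg.lstsq(A[idx] * scale[:, None], b[idx] * scale, rcond=None)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)

print("relative difference between sketched and full solutions:",
      np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full))
```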

Numerical results on real-world large-scale tensors show the method is faster than competing methods without sacrificing accuracy.

This is joint work with Brett Larsen (Stanford University).

Thu, 12 Oct 2023

14:00 - 15:00
Lecture Room 3

Hermitian preconditioning for a class of non-Hermitian linear systems

Nicole Spillane
(Ecole Polytechnique (CMAP))
Abstract

This work considers weighted and preconditioned GMRES. The objective is to provide a way of choosing the preconditioner and the inner product, also called weight, that ensure fast convergence. The main focus of the article is on Hermitian preconditioning (even for non-Hermitian problems).

Specifically, it is proposed to choose a Hermitian preconditioner H and to apply GMRES in the inner product induced by H. If, moreover, the problem matrix A is positive definite, then a new convergence bound is proved that depends only on how well H preconditions the Hermitian part of A, and on a measure of how non-Hermitian A is. In particular, if a scalable preconditioner is known for the Hermitian part of A, then the proposed method is also scalable. I will also illustrate this result numerically.
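One hedged way to realise "GMRES in the H-inner product" in code (a toy sketch, not the authors' implementation): writing H = LL^*, GMRES with preconditioner H in the H-inner product is equivalent to standard GMRES applied to the transformed system (L^{-1} A L^{-*}) y = L^{-1} b with x = L^{-*} y. The test matrix below is a random positive definite, non-Hermitian A, and H is simply taken to be its Hermitian part.

```python
# Hedged toy realization (not the authors' code) of GMRES with a Hermitian
# preconditioner H in the H-inner product, via the Cholesky factor of H.
import numpy as np
import scipy.linalg as sla
import scipy.sparse.linalg as spla

rng = np.random.default_rng(4)
n = 300
S = rng.standard_normal((n, n))
A = np.diag(np.linspace(1.0, 10.0, n)) + 0.5 * (S - S.T)   # positive definite, non-symmetric
b = rng.standard_normal(n)

H = 0.5 * (A + A.T)          # here: the exact Hermitian part of A; in practice a
L = np.linalg.cholesky(H)    # cheaper HPD approximation would be used instead

def transformed_matvec(y):
    """Apply L^{-1} A L^{-T} using triangular solves."""
    return sla.solve_triangular(L, A @ sla.solve_triangular(L, y, trans="T", lower=True), lower=True)

op = spla.LinearOperator((n, n), matvec=transformed_matvec)
counts = [0]
y, info = spla.gmres(op, sla.solve_triangular(L, b, lower=True), rtol=1e-10,
                     callback=lambda rk: counts.__setitem__(0, counts[0] + 1),
                     callback_type="pr_norm")
x = sla.solve_triangular(L, y, trans="T", lower=True)
print("GMRES iterations:", counts[0],
      " relative residual:", np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```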

Thu, 30 Nov 2023
14:00
Lecture Room 3

Multilevel adaptivity for stochastic finite element methods

Alex Bespalov
(University of Birmingham)
Abstract

This talk concerns the design and analysis of adaptive FEM-based solution strategies for partial differential equations (PDEs) with uncertain or parameter-dependent inputs. We present two conceptually different strategies: one is projection-based (stochastic Galerkin FEM) and the other is sampling-based (stochastic collocation FEM). These strategies have emerged and become popular as effective alternatives to Monte Carlo sampling in the context of (forward) uncertainty quantification. Both stochastic Galerkin and stochastic collocation approximations are typically represented as finite (sparse) expansions in terms of a parametric polynomial basis with spatial coefficients residing in finite element spaces. The focus of the talk is on multilevel approaches where different spatial coefficients may reside in different finite element spaces and, therefore, the underlying spatial approximations are allowed to be refined independently from each other.
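To make the multilevel representation concrete, here is a hedged toy sketch of the data structure only: an expansion u(x, y) ≈ Σ_k u_k(x) P_k(y) in which each spatial coefficient u_k is stored on its own, independently refinable grid and P_k is a Legendre polynomial in the parameter. The coefficients and grid sizes are invented, and none of the adaptive algorithms or error estimators from the talk are reproduced.

```python
# Hedged sketch of the *data structure* behind a multilevel parametric expansion
#   u(x, y) ~= sum_k  u_k(x) * P_k(y),
# where each spatial coefficient u_k lives on its own (independently refinable)
# grid and P_k is a polynomial in the parameter y.
import numpy as np
from numpy.polynomial import legendre

# Toy coefficients: coefficient k is stored on a grid with n_k points -- the
# multilevel aspect is simply that these resolutions may differ per coefficient.
grids = [np.linspace(0, 1, n_k) for n_k in (9, 17, 33, 5)]
coeffs = [np.sin((k + 1) * np.pi * g) / (k + 1) ** 2 for k, g in enumerate(grids)]

def evaluate(x, y):
    """Evaluate the expansion at spatial points x and a parameter y in [-1, 1]."""
    u = np.zeros_like(x)
    for k, (g, u_k) in enumerate(zip(grids, coeffs)):
        e_k = np.zeros(k + 1); e_k[k] = 1.0           # select Legendre polynomial P_k
        u += np.interp(x, g, u_k) * legendre.legval(y, e_k)
    return u

x_fine = np.linspace(0, 1, 200)
print("u(x, y=0.3) sampled on a fine grid, max value:", evaluate(x_fine, 0.3).max())
```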


We start with a more familiar setting of projection-based methods, where exploiting the Galerkin orthogonality property and polynomial approximations in terms of an orthonormal basis facilitates the design and analysis of adaptive algorithms. We discuss a posteriori error estimation as well as the convergence and rate optimality properties of the generated adaptive multilevel Galerkin approximations for PDE problems with affine-parametric coefficients. We then show how these ideas of error estimation and multilevel adaptivity can be applied in a non-Galerkin setting of stochastic collocation FEM, in particular, for PDE problems with non-affine parameterization of random inputs and for problems with parameter-dependent local spatial features.


The talk is based on a series of joint papers with Dirk Praetorius (TU Vienna), Leonardo Rocchi (Birmingham), Michele Ruggeri (University of Strathclyde, Glasgow), David Silvester (Manchester), and Feng Xu (Manchester).

Thu, 23 Nov 2023
14:00
Lecture Room 3

Making SGD parameter-free

Oliver Hinder
(University of Pittsburgh)
Abstract

We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for the step size of stochastic gradient descent (SGD), and a time-uniform concentration result that assumes no a priori bounds on SGD iterates.

Additionally, we present theoretical and numerical results for a dynamic step size schedule for SGD based on a variant of this idea. On a broad range of vision and language transfer learning tasks, our method's performance is close to that of SGD with a tuned learning rate, and a per-layer variant of our algorithm approaches the performance of tuned Adam.
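As a loose, hedged illustration of a dynamic "distance over gradients" style step size (a paraphrase for exposition, not the algorithm analysed in the papers), the sketch below runs SGD on a synthetic logistic regression with step size eta_t = max_{i<=t} ||x_i - x_0|| / sqrt(sum_{i<=t} ||g_i||^2), seeded by a small constant r_eps; the data, batch size and r_eps are arbitrary choices.

```python
# Hedged sketch of a "distance over gradients" style dynamic step size for SGD,
# in the spirit of the parameter-free methods mentioned above (a loose paraphrase
# for illustration, not the algorithm analysed in the papers).
import numpy as np

rng = np.random.default_rng(5)
n, d = 2000, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(n))

def grad(w, idx):
    """Stochastic logistic-loss gradient on a minibatch idx."""
    z = y[idx] * (X[idx] @ w)
    return -(y[idx] / (1 + np.exp(z))) @ X[idx] / len(idx)

w = w0 = np.zeros(d)
r_bar, g_sq, r_eps = 0.0, 0.0, 1e-4
for t in range(2000):
    g = grad(w, rng.choice(n, 64))
    g_sq += g @ g                               # running sum of squared gradient norms
    r_bar = max(r_bar, np.linalg.norm(w - w0))  # largest distance travelled from x_0
    eta = max(r_bar, r_eps) / np.sqrt(g_sq)     # no tuned learning rate anywhere
    w = w - eta * g

print("final mean logistic loss:", np.mean(np.log1p(np.exp(-y * (X @ w)))))
```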

This talk is based on papers with Yair Carmon and Maor Ivgi.
