Forthcoming events in this series


Thu, 13 Jun 2024

14:00 - 15:00
Lecture Room 3

A New Two-Dimensional Model-Based Subspace Method for Large-Scale Unconstrained Derivative-Free Optimization: 2D-MoSub

Pengcheng Xie
(Chinese Academy of Sciences)
Abstract

This seminar will introduce 2D-MoSub, a derivative-free optimization method based on the subspace method and quadratic models, specifically tackling large-scale derivative-free problems. 2D-MoSub combines 2-dimensional quadratic interpolation models and trust-region techniques to update the points and explore the 2-dimensional subspace iteratively. Its framework includes constructing the interpolation set, building the quadratic interpolation model, performing trust-region trial steps, and updating the trust-region radius and subspace. Computation details and theoretical properties will be discussed. Numerical results demonstrate the advantage of 2D-MoSub.

 

Short Bio:
Pengcheng Xie, PhD (Chinese Academy of Sciences), is joining Lawrence Berkeley National Laboratory as a postdoctoral scholar specializing in mathematical optimization and numerical analysis. He has developed optimization methods, including 2D-MoSub and SUSD-TR. Pengcheng has published in major journals and presented at ISMP 2024 (upcoming), ICIAM 2023, and CSIAM 2022. He received the Hua Loo-keng scholarship in 2019 and the CAS-AMSS Presidential scholarship in 2023.
 

Thu, 06 Jun 2024

14:00 - 15:00
Lecture Room 3

Structure-preserving hybrid finite element methods

Ari Stern
(Washington University in St. Louis, USA)
Abstract

The classical finite element method uses piecewise-polynomial function spaces satisfying continuity and boundary conditions. Hybrid finite element methods, by contrast, drop these continuity and boundary conditions from the function spaces and instead enforce them weakly using Lagrange multipliers. The hybrid approach has several numerical and implementational advantages, which have been studied over the last few decades.

 

In this talk, we show how the hybrid perspective has yielded new insights—and new methods—in structure-preserving numerical PDEs. These include multisymplectic methods for Hamiltonian PDEs, charge-conserving methods for the Maxwell and Yang-Mills equations, and hybrid methods in finite element exterior calculus.

Thu, 30 May 2024

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

This seminar has been cancelled

Marta Betcke
(University College London)
Abstract

Joint work Marta Betcke and Bolin Pan

In photoacoustic tomography (PAT) with flat sensor, we routinely encounter two types of limited data. The first is due to using a finite sensor and is especially perceptible if the region of interest is large relatively to the sensor or located farther away from the sensor. In this talk we focus on the second type caused by a varying sensitivity of the sensor to the incoming wavefront direction which can be modelled as binary i.e. by a cone of sensitivity. Such visibility conditions result, in Fourier domain, in a restriction of the data to a bowtie, akin to the one corresponding to the range of the forward operator but further narrowed according to the angle of sensitivity. 

We show how we can separate the visible and invisible wavefront directions in PAT image and data using a directional frame like Curvelets, and how such decomposition allows for decoupling of the reconstruction involving application of expensive forward/adjoint solvers from the training problem. We present fast and stable approximate Fourier domain forward and adjoint operators for reconstruction of the visible coefficients for such limited angle problem and a tailored UNet matching both the multi-scale Curvelet decomposition and the partition into the visible/invisible directions for learning the invisible coefficients from a training set of similar data.

Thu, 23 May 2024

14:00 - 15:00
Lecture Room 3

The bilevel optimization renaissance through machine learning: lessons and challenges

Alain Zemkoho
(University of Southampton)
Abstract

Bilevel optimization has been part of machine learning for over 4 decades now, although perhaps not always in an obvious way. The interconnection between the two topics started appearing more clearly in publications since about 20 years now, and in the last 10 years, the number of machine learning applications of bilevel optimization has literally exploded. This rise of bilevel optimization in machine learning has been highly positive, as it has come with many innovations in the theoretical and numerical perspectives in understanding and solving the problem, especially with the rebirth of the implicit function approach, which seemed to have been abandoned at some point.
Overall, machine learning has set the bar very high for the whole field of bilevel optimization with regards to the development of numerical methods and the associated convergence analysis theory, as well as the introduction of efficient tools to speed up components such as derivative calculations among other things. However, it remains unclear how the techniques from the machine learning—based bilevel optimization literature can be extended to other applications of bilevel programming. 
For instance, many machine learning loss functions and the special problem structures enable the fulfillment of some qualification conditions that will fail for multiple other applications of bilevel optimization. In this talk, we will provide an overview of machine learning applications of bilevel optimization while giving a flavour of corresponding solution algorithms and their limitations. 
Furthermore, we will discuss possible paths for algorithms that can tackle more complicated machine learning applications of bilevel optimization, while also highlighting lessons that can be learned for more general bilevel programs.

Thu, 16 May 2024

14:00 - 15:00
Lecture Room 3

Multilevel Monte Carlo methods for the approximation of failure probability regions

Matteo Croci
(Basque Center for Applied Mathematics)
Abstract

In this talk, we consider the problem of approximating failure regions. More specifically, given a costly computational model with random parameters and a failure condition, our objective is to determine the parameter region in which the failure condition is likely to not be satisfied. In mathematical terms, this problem can be cast as approximating the level set of a probability density function. We solve this problem by dividing it into two: 1) The design of an efficient Monte Carlo strategy for probability estimation. 2) The construction of an efficient algorithm for level-set approximation. Following this structure, this talk is comprised of two parts:

In the first part, we present a new multi-output multilevel best linear unbiased estimator (MLBLUE) for approximating expectations. The advantage of this estimator is in its convenience and optimality: Given any set of computational models with known covariance structure, MLBLUE automatically constructs a provenly optimal estimator for any (finite) number of quantities of interest. Nevertheless, the optimality of MLBLUE is tied to its optimal set-up, which requires the solution of a nonlinear optimization problem. We show how the latter can be reformulated as a semi-definite program and thus be solved reliably and efficiently.

In the second part, we construct an adaptive level-set approximation algorithm for smooth functions corrupted by noise in $\mathbb{R}^d$. This algorithm only requires point value data and is thus compatible with Monte Carlo estimators. The algorithm is comprised of a criterion for level-set adaptivity combined with an a posteriori error estimator. Under suitable assumptions, we can prove that our algorithm will correctly capture the target level set at the same cost complexity of uniformly approximating a $(d-1)$-dimensional function.

Thu, 09 May 2024

14:00 - 15:00
Lecture Room 4

Fast optimistic methods for monotone equations and convex optimization problems

Radu Bot
(University of Vienna)
Further Information

 

Please note; the seminar is taking place in Lecture Room 4 on this occasion 

Abstract

In this talk, we discuss continuous in time dynamics for the problem of approaching the set of zeros of a single-valued monotone and continuous operator V . Such problems are motivated by minimax convexconcave and, in particular, by convex optimization problems with linear constraints. The central role is played by a second-order dynamical system that combines a vanishing damping term with the time derivative of V along the trajectory, which can be seen as an analogous of the Hessian-driven damping in case the operator is originating from a potential. We show that these methods exhibit fast convergence rates for kV (z(t))k as t ! +1, where z( ) denotes the generated trajectory, and for the restricted gap function, and that z( ) converges to a zero of the operator V . For the corresponding implicit and explicit discrete time models with Nesterov’s momentum, we prove that they share the asymptotic features of the continuous dynamics.

Extensions to variational inequalities and fixed-point problems are also addressed. The theoretical results are illustrated by numerical experiments on bilinear games and the training of generative adversarial networks.

Thu, 02 May 2024

14:00 - 15:00
Lecture Room 3

Mathematics: key enabling technology for scientific machine learning

Wil Schilders
(TU Eindhoven)
Abstract

Artificial Intelligence (AI) will strongly determine our future prosperity and well-being. Due to its generic nature, AI will have an impact on all sciences and business sectors, our private lives and society as a whole. AI is pre-eminently a multidisciplinary technology that connects scientists from a wide variety of research areas, from behavioural science and ethics to mathematics and computer science.

Without downplaying the importance of that variety, it is apparent that mathematics can and should play an active role. All the more so as, alongside the successes of AI, also critical voices are increasingly heard. As Robert Dijkgraaf (former director of the Princeton Institute of Advanced Studies) observed in May 2019: ”Artificial intelligence is in its adolescent phase, characterised by trial and error, self-aggrandisement, credulity and lack of systematic understanding.” Mathematics can contribute to the much-needed systematic understanding of AI, for example, greatly improving reliability and robustness of AI algorithms, understanding the operation and sensitivity of networks, reducing the need for abundant data sets, or incorporating physical properties into neural networks needed for super-fast and accurate simulations in the context of digital twinning.

Mathematicians absolutely recognize the potential of artificial intelligence, machine learning and (deep) neural networks for future developments in science, technology and industry. At the same time, a sound mathematical treatment is essential for all aspects of artificial intelligence, including imaging, speech recognition, analysis of texts or autonomous driving, implying it is essential to involve mathematicians in all these areas. In this presentation, we highlight the role of mathematics as a key enabling technology within the emerging field of scientific machine learning. Or, as I always say: ”Real intelligence is needed to make artificial intelligence work.”

 

Thu, 25 Apr 2024

14:00 - 15:00
Lecture Room 3

ESPIRA: Estimation of Signal Parameters via Iterative Rational Approximation

Nadiia Derevianko
(University of Göttingen)
Abstract

We introduce a new method - ESPIRA (Estimation of Signal Parameters via Iterative Rational Approximation) \cite{DP22,  DPP21} - for the recovery of complex exponential  sums
$$
f(t)=\sum_{j=1}^{M} \gamma_j \mathrm{e}^{\lambda_j t},
$$
that are determined by a finite number of parameters: the order $M$, weights $\gamma_j \in \mathbb{C} \setminus \{0\}$ and nodes  $\mathrm{e}^{\lambda_j} \in \mathbb{C}$ for $j=1,...,M$.  Our new recovery procedure is based on the observation that Fourier coefficients (or DFT coefficients) of exponential sums have a special rational structure.  To  reconstruct this structure in a stable way we use the AAA algorithm  proposed by Nakatsukasa et al.   We show that ESPIRA can be interpreted as a matrix pencil method applied to Loewner matrices. 

During the talk we will demonstrate that ESPIRA outperforms Prony-like methods such as ESPRIT and MPM for noisy data and for signal approximation by short exponential sums.  

 

Bibliography
N. Derevianko,  G.  Plonka, 
Exact reconstruction of extended exponential sums using rational approximation of their Fourier coefficients, Anal.  Appl.,  20(3),  2022,  543-577.


N. Derevianko,  G. Plonka,  M. Petz, 
From ESPRIT to ESPIRA: Estimation of signal parameters by iterative rational approximation,   IMA J. Numer. Anal.,  43(2),  2023, 789--827.  


Y. Nakatsukasa, O. Sète,   L.N. Trefethen,  The AAA algorithm for rational approximation.
SIAM J. Sci. Comput., 40(3),   2018,  A1494–A1522.  

Mon, 08 Apr 2024

11:00 - 12:00
Lecture Room 3

Heavy-Tailed Large Deviations and Sharp Characterization of Global Dynamics of SGDs in Deep Learning

Chang-Han Rhee
(Northwestern University, USA)
Abstract

While the typical behaviors of stochastic systems are often deceptively oblivious to the tail distributions of the underlying uncertainties, the ways rare events arise are vastly different depending on whether the underlying tail distributions are light-tailed or heavy-tailed. Roughly speaking, in light-tailed settings, a system-wide rare event arises because everything goes wrong a little bit as if the entire system has conspired up to provoke the rare event (conspiracy principle), whereas, in heavy-tailed settings, a system-wide rare event arises because a small number of components fail catastrophically (catastrophe principle). In the first part of this talk, I will introduce the recent developments in the theory of large deviations for heavy-tailed stochastic processes at the sample path level and rigorously characterize the catastrophe principle for such processes. 

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descents (SGDs) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance such capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding of SGDs' global dynamics within complex non-convex loss landscapes. In the second part of this talk, I will characterize the global dynamics of SGDs building on the heavy-tailed large deviations and local stability framework developed in the first part. This leads to the heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noises during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance for the test data.  

 

This talk is based on the joint work with Mihail Bazhba, Jose Blanchet, Bohan Chen, Sewoong Oh, Zhe Su, Xingyu Wang, and Bert Zwart.

Thu, 07 Mar 2024

14:00 - 15:00
Lecture Room 3

Stabilized Lagrange-Galerkin schemes for viscous and viscoelastic flow problems

Hirofumi Notsu
(Kanazawa University)
Abstract

Many researchers are developing stable and accurate numerical methods for flow problems, which roughly belong to upwind methods or characteristics(-based) methods. 
The Lagrange-Galerkin method proposed and analyzed in, e.g., [O. Pironneau. NM, 1982] and [E. S\"uli. NM, 1988] is the finite element method combined with the idea of the method of characteristics; hence, it belongs to the characteristics(-based) methods. The advantages are the CFL-free robustness for convection-dominated problems and the symmetry of the resulting coefficient matrix. In this talk, we introduce stabilized Lagrange-Galerkin schemes of second order in time for viscous and viscoelastic flow problems, which employ the cheapest conforming P1-element with the help of pressure-stabilization [F. Brezzi and J. Pitk\"aranta. Vieweg+Teubner, 1984] for all the unknown functions, i.e., velocity, pressure, and conformation tensor, reducing the number of DOFs. 
Focusing on the recent developments of discretizations of the (non-conservative and conservative) material derivatives and the upper-convected time derivative, we present theoretical and numerical results.

Thu, 29 Feb 2024

14:00 - 15:00
Lecture Room 3

On the use of "conventional" unconstrained minimization solvers for training regression problems in scientific machine learning

Stefano Zampini
(King Abdullah University of Science and Technology (KAUST))
Abstract

In recent years, we have witnessed the emergence of scientific machine learning as a data-driven tool for the analysis, by means of deep-learning techniques, of data produced by computational science and engineering applications.  At the core of these methods is the supervised training algorithm to learn the neural network realization, a highly non-convex optimization problem that is usually solved using stochastic gradient methods.

However, distinct from deep-learning practice, scientific machine-learning training problems feature a much larger volume of smooth data and better characterizations of the empirical risk functions, which make them suited for conventional solvers for unconstrained optimization.

In this talk, we empirically demonstrate the superior efficacy of a trust region method based on the Gauss-Newton approximation of the Hessian in improving the generalization errors arising from regression tasks when learning surrogate models for a wide range of scientific machine-learning techniques and test cases. All the conventional solvers tested, including L-BFGS and inexact Newton with line-search, compare favorably, either in terms of cost or accuracy, with the adaptive first-order methods used to validate the surrogate models.

Thu, 22 Feb 2024

14:00 - 15:00
Lecture Room 3

Hierarchical adaptive low-rank format with applications to discretized PDEs

Leonardo Robol
(University of Pisa)
Abstract

A novel framework for hierarchical low-rank matrices is proposed that combines an adaptive hierarchical partitioning of the matrix with low-rank approximation. One typical application is the approximation of discretized functions on rectangular domains; the flexibility of the format makes it possible to deal with functions that feature singularities in small, localized regions. To deal with time evolution and relocation of singularities, the partitioning can be dynamically adjusted based on features of the underlying data. Our format can be leveraged to efficiently solve linear systems with Kronecker product structure, as they arise from discretized partial differential equations (PDEs). For this purpose, these linear systems are rephrased as linear matrix equations and a recursive solver is derived from low-rank updates of such equations. 
We demonstrate the effectiveness of our framework for stationary and time-dependent, linear and nonlinear PDEs, including the Burgers' and Allen–Cahn equations.

This is a joint work with Daniel Kressner and Stefano Massei.

Thu, 15 Feb 2024
14:00

Algorithmic Insurance

Agni Orfanoudaki
(Oxford University Saïd Business School)
Abstract

As machine learning algorithms get integrated into the decision-making process of companies and organizations, insurance products are being developed to protect their providers from liability risk. Algorithmic liability differs from human liability since it is based on data-driven models compared to multiple heterogeneous decision-makers and its performance is known a priori for a given set of data. Traditional actuarial tools for human liability do not consider these properties, primarily focusing on the distribution of historical claims. We propose, for the first time, a quantitative framework to estimate the risk exposure of insurance contracts for machine-driven liability, introducing the concept of algorithmic insurance. Our work provides ML model developers and insurance providers with a comprehensive risk evaluation approach for this new class of products. Thus, we set the foundations of a niche area of research at the intersection of the literature in operations, risk management, and actuarial science. Specifically, we present an optimization formulation to estimate the risk exposure of a binary classification model given a pre-defined range of premiums. Our approach outlines how properties of the model, such as discrimination performance, interpretability, and generalizability, can influence the insurance contract evaluation. To showcase a practical implementation of the proposed framework, we present a case study of medical malpractice in the context of breast cancer detection. Our analysis focuses on measuring the effect of the model parameters on the expected financial loss and identifying the aspects of algorithmic performance that predominantly affect the risk of the contract.

Paper Reference: Bertsimas, D. and Orfanoudaki, A., 2021. Pricing algorithmic insurance. arXiv preprint arXiv:2106.00839.

Paper link: https://arxiv.org/pdf/2106.00839.pdf

Thu, 08 Feb 2024
14:00
Lecture Room 3

From Chebfun3 to RTSMS: A journey into deterministic and randomized Tucker decompositions

Behnam Hashemi
(Leicester University)
Abstract
The Tucker decomposition is a family of representations that break up a given d-dimensional tensor into the multilinear product of a core tensor and a factor matrix along each of the d-modes. It is a useful tool in extracting meaningful insights from complex datasets and has found applications in various fields, including scientific computing, signal processing and machine learning. 
 In this talk we will first focus on the continuous framework and revisit how Tucker decomposition forms the foundation of Chebfun3 for numerical computing with 3D functions and the deterministic algorithm behind Chebfun3. The key insight is that separation of variables achieved via low-rank Tucker decomposition simplifies and speeds up lots of subsequent computations.
 We will then switch to the discrete framework and discuss a new algorithm called RTSMS (randomized Tucker with single-mode sketching). The single-mode sketching aspect of RTSMS allows utilizing simple sketch matrices which are substantially smaller than alternative methods leading to considerable performance gains. Within its least-squares strategy, RTSMS incorporates leverage scores for efficiency with Tikhonov regularization and iterative refinement for stability. RTSMS is demonstrated to be competitive with existing methods, sometimes outperforming them by a large margin.
We illustrate the benefits of Tucker decomposition via MATLAB demos solving problems from global optimization to video compression. RTSMS is joint work with Yuji Nakatsukasa.
Thu, 01 Feb 2024
14:00
Lecture Room 3

A strongly polynomial algorithm for the minimum-cost generalized flow problem

Laszlo Vegh
(LSE)
Abstract

We give a strongly polynomial algorithm for minimum cost generalized flow, and as a consequence, for all linear programs with at most two nonzero entries per row, or at most two nonzero entries per column. While strongly polynomial algorithms for the primal and dual feasibility problems have been known for a long time, various combinatorial approaches used for those problems did not seem to carry over to the minimum-cost variant.

Our approach is to show that the ‘subspace layered least squares’ interior point method, an earlier joint work with Allamigeon, Dadush, Loho and Natura requires only a strongly polynomial number of iterations for minimum cost generalized flow. We achieve this by bounding the straight line complexity, introduced in the same paper. The talk will give an overview of the interior point method as well as the combinatorial straight-line complexity analysis for the particular setting. This is joint work with Daniel Dadush, Zhuan Khye Koh, Bento Natura, and Neil Olver.

Thu, 25 Jan 2024

14:00 - 15:00
Lecture Room 3

Stress and flux-based finite element methods

Fleurianne Bertrand
(Chemnitz University of Technology)
Abstract

This talk explores recent advancements in stress and flux-based finite element methods. It focuses on addressing the limitations of traditional finite elements, in order to describe complex material behavior and engineer new metamaterials.

Stress and flux-based finite element methods are particularly useful in error estimation, laying the groundwork for adaptive refinement strategies. This concept builds upon the hypercircle theorem [1], which states that in a specific energy space, both the exact solution and any admissible stress field lie on a hypercircle. However, the construction of finite element spaces that satisfy admissible states for complex material behavior is not straightforward. It often requires a relaxation of specific properties, especially when dealing with non-symmetric stress tensors [2] or hyperelastic materials.

Alternatively, methods that directly approximate stresses can be employed, offering high accuracy of the stress fields and adherence to physical conservation laws. However, when approximating eigenvalues, this significant benefit for the solution's accuracy implies that the solution operator cannot be compact. To address this, the solution operator must be confined to a subset of the solution that excludes the stresses. Yet, due to compatibility conditions, the trial space for the other solution components typically does not yield the desired accuracy. The second part of this talk will therefore explore the Least-Squares method as a remedy to these challenges [3].

To conclude this talk, we will emphasize the integration of those methods within global solution strategies, with a particular focus on the challenges regarding model order reduction methods [4].

 

[1] W. Prager, J. Synge. Approximations in elasticity based on the concept of function space.

Quarterly of Applied Mathematics 5(3), 1947.

[2] FB, K. Bernhard, M. Moldenhauer, G. Starke. Weakly symmetric stress equilibration and a posteriori error estimation for linear elasticity, Numerical Methods for Partial Differential Equations 37(4), 2021.

[3] FB, D. Boffi. First order least-squares formulations for eigenvalue problems, IMA Journal of Numerical Analysis 42(2), 2023.

[4] FB, D. Boffi, A. Halim. A reduced order model for the finite element approximation of eigenvalue problems,Computer Methods in Applied Mechanics and Engineering 404, 2023.

 

Thu, 18 Jan 2024

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

A preconditioner with low-rank corrections based on the Bregman divergence

Andreas Bock
(Danish Technical University)
Abstract

We present a general framework for preconditioning Hermitian positive definite linear systems based on the Bregman log determinant divergence. This divergence provides a measure of discrepancy between a preconditioner and a target matrix, giving rise to

the study of preconditioners given as the sum of a Hermitian positive definite matrix plus a low-rank correction. We describe under which conditions the preconditioner minimises the $\ell^2$ condition number of the preconditioned matrix, and obtain the low-rank 

correction via a truncated singular value decomposition (TSVD). Numerical results from variational data assimilation (4D-VAR) support our theoretical results.

 

We also apply the framework to approximate factorisation preconditioners with a low-rank correction (e.g. incomplete Cholesky plus low-rank). In such cases, the approximate factorisation error is typically indefinite, and the low-rank correction described by the Bregman divergence is generally different from one obtained as a TSVD. We compare these two truncations in terms of convergence of the preconditioned conjugate gradient method (PCG), and show numerous examples where PCG converges to a small tolerance using the proposed preconditioner, whereas PCG using a TSVD-based preconditioner fails. We also consider matrices arising from interior point methods for linear programming that do not admit such an incomplete factorisation by default, and present a robust incomplete Cholesky preconditioner based on the proposed methodology.

The talk is based on papers with Martin S. Andersen (DTU).

 

Thu, 30 Nov 2023
14:00
Lecture Room 3

Multilevel adaptivity for stochastic finite element methods

Alex Bespalov
(Birmingham University)
Abstract

This talk concerns the design and analysis of adaptive FEM-based solution strategies for partial differential equations (PDEs) with uncertain or parameter-dependent inputs. We present two conceptually different strategies: one is projection-based (stochastic Galerkin FEM) and the other is sampling-based (stochastic collocation FEM). These strategies have emerged and become popular as effective alternatives to Monte-Carlo sampling in the context of (forward) uncertainty quantification. Both stochastic Galerkin and stochastic collocation approximations are typically represented as finite (sparse) expansions in terms of a parametric polynomial basis with spatial coefficients residing in finite element spaces. The focus of the talk is on multilevel approaches where different spatial coefficients may reside in different finite element spaces and, therefore, the underlying spatial approximations are allowed to be refined independently from each other.

 

We start with a more familiar setting of projection-based methods, where exploiting the Galerkin orthogonality property and polynomial approximations in terms of an orthonormal basis facilitates the design and analysis of adaptive algorithms. We discuss a posteriori error estimation as well as the convergence and rate optimality properties of the generated adaptive multilevel Galerkin approximations for PDE problems with affine-parametric coefficients. We then show how these ideas of error estimation and multilevel adaptivity can be applied in a non-Galerkin setting of stochastic collocation FEM, in particular, for PDE problems with non-affine parameterization of random inputs and for problems with parameter-dependent local spatial features.

 

The talk is based on a series of joint papers with Dirk Praetorius (TU Vienna), Leonardo Rocchi (Birmingham), Michele Ruggeri (University of Strathclyde, Glasgow), David Silvester (Manchester), and Feng Xu (Manchester).

Thu, 23 Nov 2023
14:00
Lecture Room 3

Making SGD parameter-free

Oliver Hinder
(University of Pittsburgh)
Abstract

We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for the step size of stochastic gradient descent (SGD), and a time-uniform concentration result that assumes no a-priori bounds on SGD iterates.

Additionally, we present theoretical and numerical results for a dynamic step size schedule for SGD based on a variant of this idea. On a broad range of vision and language transfer learning tasks our methods performance is close to that of SGD with tuned learning rate. Also, a per-layer variant of our algorithm approaches the performance of tuned ADAM.

This talk is based on papers with Yair Carmon and Maor Ivgi.

Thu, 16 Nov 2023

14:00 - 15:00
Lecture Room 3

Finite element schemes and mesh smoothing for geometric evolution problems

Bjorn Stinner
(University of Warwick)
Abstract

Geometric evolutions can arise as simple models or fundamental building blocks in various applications with moving boundaries and time-dependent domains, such as grain boundaries in materials or deforming cell boundaries. Mesh-based methods require adaptation and smoothing, particularly in the case of strong deformations. We consider finite element schemes based on classical approaches for geometric evolution equations but augmented with the gradient of the Dirichlet energy or a variant of it, which is known to produce a tangential mesh movement beneficial for the mesh quality. We focus on the one-dimensional case, where convergence of semi-discrete schemes can be proved, and discuss two cases. For networks forming triple junctions, it is desirable to keep the impact of any additional, mesh smoothing terms on the geometric evolution as small as possible, which can be achieved with a perturbation approach. Regarding the elastic flow of curves, the Dirichlet energy can serve as a replacement of the usual penalty in terms of the length functional in that, modulo rescaling, it yields the same minimisers in the long run.

Thu, 09 Nov 2023
14:00
Rutherford Appleton Laboratory, nr Didcot

Numerical shape optimization: a bit of theory and a bit of practice

Alberto Paganini
(University of Leicester)
Further Information

Please note this seminar is held at Rutherford Appleton Laboratory (RAL)

Rutherford Appleton Laboratory
Harwell Campus
Didcot
OX11 0QX

How to get to RAL

 

Abstract

We use the term shape optimization when we want to find a minimizer of an objective function that assigns real values to shapes of domains. Solving shape optimization problems can be quite challenging, especially when the objective function is constrained to a PDE, in the sense that evaluating the objective function for a given domain shape requires first solving a boundary value problem stated on that domain. The main challenge here is that shape optimization methods must employ numerical methods capable of solving a boundary value problem on a domain that changes after each iteration of the optimization algorithm.

 

The first part of this talk will provide a gentle introduction to shape optimization. The second part of this talk will highlight how the finite element framework leads to automated numerical shape optimization methods, as realized in the open-source library fireshape. The talk will conclude with a brief overview of some academic and industrial applications of shape optimization.

 

 

Thu, 02 Nov 2023
14:00
Lecture Room 3

Recent Developments in the Numerical Solution of PDE-Constrained Optimization Problems

John Pearson
(Edinburgh University)
Abstract

Optimization problems subject to PDE constraints constitute a mathematical tool that can be applied to a wide range of scientific processes, including fluid flow control, medical imaging, option pricing, biological and chemical processes, and electromagnetic inverse problems, to name a few. These problems involve minimizing some function arising from a particular physical objective, while at the same time obeying a system of PDEs which describe the process. It is necessary to obtain accurate solutions to such problems within a reasonable CPU time, in particular for time-dependent problems, for which the “all-at-once” solution can lead to extremely large linear systems.

 

In this talk we consider iterative methods, in particular Krylov subspace methods, to solve such systems, accelerated by fast and robust preconditioning strategies. In particular, we will survey several new developments, including block preconditioners for fluid flow control problems, a circulant preconditioning framework for solving certain optimization problems constrained by fractional differential equations, and multiple saddle-point preconditioners for block tridiagonal linear systems. We will illustrate the benefit of using these new approaches through a range of numerical experiments.

 

This talk is based on work with Santolo Leveque (Scuola Normale Superiore, Pisa), Spyros Pougkakiotis (Yale University), Jacek Gondzio (University of Edinburgh), and Andreas Potschka (TU Clausthal).

Thu, 26 Oct 2023
14:00
Lecture Room 3

Algebraic domain-decomposition preconditioners for the solution of linear systems

Tyrone Rees
(Rutherford Appleton Laboratory)
Abstract

The need to solve linear systems of equations is ubiquitous in scientific computing. Powerful methods for preconditioning such systems have been developed in cases where we can exploit knowledge of the origin of the linear system; a recent example from the solution of systems from PDEs is the Gen-EO domain decomposition method which works well, but requires a non-trival amount of knowledge of the underlying problem to implement.  

In this talk I will present a new spectral coarse space that can be constructed in a fully-algebraic way, in contrast to most existing spectral coarse spaces, and will give a theoretical convergence result for Hermitian positive definite diagonally dominant matrices. Numerical experiments and comparisons against state-of-the-art preconditioners in the multigrid community show that the resulting two-level Schwarz preconditioner is efficient, especially for non-self-adjoint operators. Furthermore, in this case, our proposed preconditioner outperforms state-of-the-art preconditioners.

This is joint work with Hussam Al Daas, Pierre Jolivet and Jennifer Scott.

Thu, 19 Oct 2023

14:00 - 15:00
Lecture Room 3

Randomized Least Squares Optimization and its Incredible Utility for Large-Scale Tensor Decomposition

Tammy Kolda
(mathsci.ai)
Abstract

Randomized least squares is a promising method but not yet widely used in practice. We show an example of its use for finding low-rank canonical polyadic (CP) tensor decompositions for large sparse tensors. This involves solving a sequence of overdetermined least problems with special (Khatri-Rao product) structure.

In this work, we present an application of randomized algorithms to fitting the CP decomposition of sparse tensors, solving a significantly smaller sampled least squares problem at each iteration with probabilistic guarantees on the approximation errors. We perform sketching through leverage score sampling, crucially relying on the fact that the problem structure enable efficient sampling from overestimates of the leverage scores with much less work. We discuss what it took to make the algorithm practical, including general-purpose improvements.

Numerical results on real-world large-scale tensors show the method is faster than competing methods without sacrificing accuracy.

*This is joint work with Brett Larsen, Stanford University.

Thu, 12 Oct 2023

14:00 - 15:00
Lecture Room 3

Hermitian preconditioning for a class of non-Hermitian linear systems

Nicole Spillane
(Ecole Polytechnique (CMAP))
Abstract

This work considers weighted and preconditioned GMRES. The objective is to provide a way of choosing the preconditioner and the inner product, also called weight, that ensure fast convergence. The main focus of the article is on Hermitian preconditioning (even for non-Hermitian problems).

It is indeed proposed to choose a Hermitian preconditioner H, and to apply GMRES in the inner product induced by H. If moreover, the problem matrix A is positive definite, then a new convergence bound is proved that depends only on how well H preconditions the Hermitian part of A, and on a measure of how non-Hermitian A is. In particular, if a scalable preconditioner is known for the Hermitian part of A, then the proposed method is also scalable. I will also illustrate this result numerically.