Forthcoming events in this series


Thu, 12 Dec 2024
14:00
(This talk is hosted by Rutherford Appleton Laboratory)

A Subspace-conjugate Gradient Method for Linear Matrix Equations

Davide Palitta
(Università di Bologna)
Abstract

 
The solution of multiterm linear matrix equations is still a very challenging task in numerical linear algebra.
If many different solid methods for equations with (at most) two terms exist in the literature, having a number of terms greater than two makes the numerical treatment of these equations much trickier. Very few options are available in the literature. In particular, to the best of our knowledge, no decomposition-based method for multiterm equations has never been proposed; only iterative procedures exist.
A non-complete list of contributions in this direction includes a greedy procedure designed by Sirkovi\'c and Kressner, projection methods tailored to the equation at hand, Riemannian optimization schemes, and matrix-oriented Krylov methods with low-rank truncations. The last class of solvers is probably one of the most commonly used ones. These schemes amount to adapting standard Krylov schemes for linear systems to matrix equations by leveraging the equivalence between matrix equations and their Kronecker form.
As long as no truncations are performed, this equivalence means that the algorithm itself is not exploiting the structure of the problem as it is not able to see that we are actually solving a matrix equation and not a linear system. The low-rank truncations we may perform in the matrix-equation framework can be viewed as a simple computational tool needed to make the solution process affordable in terms of storage allocation and not as an algorithmic advance.

 
By taking inspiration from the matrix-oriented cg method, our goal is to design a novel iterative scheme for the solution of multiterm matrix equations. We name this procedure the subspace-conjugate gradient method (Ss-cg) due to the peculiar orthogonality conditions imposed to compute some of the quantities involved in the scheme. As we will show in this talk, the main difference between Ss-cg and cg is the ability of the former to capitalize on the matrix equation structure of the underlying problem. In particular, we will make full use of the (low-rank) matrix format of the iterates to define appropriate ``step-sizes''. If these quantities correspond to scalars alpha_k and beta_k in cg, they will amount to small-dimensional matrices in our fresh approach.
This novel point of view leads to remarkable computational gains making Ss-cg a very competitive option for the solution of multi-term linear matrix equations.

 
This is a joint work with Martina Iannacito and Valeria Simoncini, both from the Department of Mathematics, University of Bologna.
 
Thu, 05 Dec 2024

14:00 - 15:00
Lecture Room 3

Solving (algebraic problems from) PDEs; a personal perspective

Andy Wathen
(Oxford University)
Abstract

We are now able to solve many partial differential equation problems that were well beyond reach when I started in academia. Some of this success is due to computer hardware but much is due to algorithmic advances. 

I will give a personal perspective of the development of computational methodology in this area over my career thus far. 

Thu, 28 Nov 2024

14:00 - 15:00
Lecture Room 3

Unleashing the Power of Deeper Layers in LLMs

Shiwei Liu
(Oxford University)
Abstract

Large Language Models (LLMs) have demonstrated impressive achievements. However, recent research has shown that their deeper layers often contribute minimally, with effectiveness diminishing as layer depth increases. This pattern presents significant opportunities for model compression. 

In the first part of this seminar, we will explore how this phenomenon can be harnessed to improve the efficiency of LLM compression and parameter-efficient fine-tuning. Despite these opportunities, the underutilization of deeper layers leads to inefficiencies, wasting resources that could be better used to enhance model performance. 

The second part of the talk will address the root cause of this ineffectiveness in deeper layers and propose a solution. We identify the issue as stemming from the prevalent use of Pre-Layer Normalization (Pre-LN) and introduce Mix-Layer Normalization (Mix-LN) with combined Pre-LN and Post-LN as a new approach to mitigate this training deficiency.

Thu, 21 Nov 2024

14:00 - 15:00
Lecture Room 3

Tackling complexity in multiscale kinetic and mean-field equations

Lorenzo Pareschi
(Heriot Watt University)
Abstract

Kinetic and mean-field equations are central to understanding complex systems across fields such as classical physics, engineering, and the socio-economic sciences. Efficiently solving these equations remains a significant challenge due to their high dimensionality and the need to preserve key structural properties of the models. 

In this talk, we will focus on recent advancements in deterministic numerical methods, which provide an alternative to particle-based approaches (such as Monte Carlo or particle-in-cell methods) by avoiding stochastic fluctuations and offering higher accuracy. We will discuss strategies for designing these methods to reduce computational complexity while preserving fundamental physical properties and maintaining efficiency in stiff regimes. 
Special attention will be given to the role of these methods in addressing multi-scale problems in rarefied gas dynamics and plasma physics. Time permitting, we will also touch on emerging techniques for uncertainty quantification in these systems.

Thu, 14 Nov 2024

14:00 - 15:00
Lecture Room 3

Group discussion on the use of AI tools in research

Mike Giles
(Oxford University)
Abstract

AI tools like ChatGPT, Microsoft Copilot, GitHub Copilot, Claude and even older AI-enabled tools like Grammarly and MS Word, are becoming an everyday part of our research environment.  This last-minute opening up of a seminar slot due to the unfortunate illness of our intended speaker (who will hopefully re-schedule for next term) gives us an opportunity to discuss what this means for us as researchers; what are good helpful uses of AI, and are there uses of AI which we might view as inappropriate?  Please come ready to participate with examples of things which you have done yourselves with AI tools.

Thu, 07 Nov 2024

14:00 - 15:00
Lecture Room 3

Multilevel Monte Carlo methods

Mike Giles
(Oxford University)
Abstract

In this seminar I will begin by giving an overview of some problems in stochastic simulation and uncertainty quantification. I will then outline the Multilevel Monte Carlo for situations in which accurate simulations are very costly, but it is possible to perform much cheaper, less accurate simulations.  Inspired by the multigrid method, it is possible to use a combination of these to achieve the desired overall accuracy at a much lower cost.

Thu, 31 Oct 2024

14:00 - 15:00
Lecture Room 3

Theory to Enable Practical Quantum Advantage

Balint Koczor
(Oxford University)
Abstract

Quantum computers are becoming a reality and current generations of machines are already well beyond the 50-qubit frontier. However, hardware imperfections still overwhelm these devices and it is generally believed the fault-tolerant, error-corrected systems will not be within reach in the near term: a single logical qubit needs to be encoded into potentially thousands of physical qubits which is prohibitive.

 

Due to limited resources, in the near term, hybrid quantum-classical protocols are the most promising candidates for achieving early quantum advantage and these need to resort to quantum error mitigation techniques. I will explain the basic concepts and introduce hybrid quantum-classical protocols are the most promising candidates for achieving early quantum advantage. These have the potential to solve real-world problems---including optimisation or ground-state search---but they suffer from a large number of circuit repetitions required to extract information from the quantum state. I will finally identify the most likely areas where quantum computers may deliver a true advantage in the near term.

 

Bálint Koczor

Associate Professor in Quantum Information Theory

Mathematical Institute, University of Oxford

webpage

Thu, 24 Oct 2024

14:00 - 15:00
(This talk is hosted by Rutherford Appleton Laboratory)

Machine learning in solution of inverse problems: subjective perspective

Marta Betcke
(University College London)
Abstract

Following the 2012 breakthrough in deep learning for classification and visions problems, the last decade has seen tremendous raise of interest in machine learning in a wider mathematical research community from foundational research through field specific analysis to applications. 

As data is at the core of any inverse problem, it was a natural direction for the field to investigate how machine learning could aid various aspects of inversion yielding numerous approaches from somewhat ad-hoc but very effective like learned unrolled methods to provably convergent learned regularisers with everything in between. In this talk I will review some on these developments through a lens of the research of our group.   

 

Thu, 17 Oct 2024

14:00 - 15:00
Lecture Room 3

On the loss of orthogonality in low-synchronization variants of reorthogonalized block classical Gram-Schmidt

Kathryn Lund
(STFC Rutherford Appleton Laboratory)
Abstract
Interest in communication-avoiding orthogonalization schemes for high-performance computing has been growing recently.  We address open questions about the numerical stability of various block classical Gram-Schmidt variants that have been proposed in the past few years.  An abstract framework is employed, the flexibility of which allows for new rigorous bounds on the loss of orthogonality in these variants. We first analyse a generalization of (reorthogonalized) block classical Gram-Schmidt and show that a "strong'' intrablock orthogonalization routine is only needed for the very first block in order to maintain orthogonality on the level of the unit roundoff. 
Using this variant, which has four synchronization points per block column, we remove the synchronization points one at a time and analyse how each alteration affects the stability of the resulting method. Our analysis shows that the variant requiring only one synchronization per block column cannot be guaranteed to be stable in practice, as stability begins to degrade with the first reduction of synchronization points.
Our analysis of block methods also provides new theoretical results for the single-column case. In particular, it is proven that DCGS2 from Bielich, D. et al. {Par. Comput.} 112 (2022)] and CGS-2 from Swirydowicz, K. et al, {Num. Lin. Alg. Appl.} 28 (2021)] are as stable as Householder QR.  
Numerical examples from the BlockStab toolbox are included throughout, to help compare variants and illustrate the effects of different choices of intraorthogonalization subroutines.


 

Fri, 20 Sep 2024

14:00 - 15:00
TCC VC

Finite element approximation of eigenvalue problems

Prof Danielle Boffi
(KAUST - Computer, Electrical and Mathematical Sciences and Engineering - CEMSE)
Abstract

In this informal talk I will review some theoretical and practical aspects related to the finite element approximation of eigenvalue problems arising from PDEs.
The review will cover elliptic eigenvalue problems and eigenvalue problems in mixed form, with particular emphasis on the Maxwell eigenvalue problem.
Other topics can be discussed depending on the interests of the audience, including adaptive schemes, approximation of parametric problems, reduced order models.
 

Thu, 13 Jun 2024

14:00 - 15:00
Lecture Room 3

A New Two-Dimensional Model-Based Subspace Method for Large-Scale Unconstrained Derivative-Free Optimization: 2D-MoSub

Pengcheng Xie
(Chinese Academy of Sciences)
Abstract

This seminar will introduce 2D-MoSub, a derivative-free optimization method based on the subspace method and quadratic models, specifically tackling large-scale derivative-free problems. 2D-MoSub combines 2-dimensional quadratic interpolation models and trust-region techniques to update the points and explore the 2-dimensional subspace iteratively. Its framework includes constructing the interpolation set, building the quadratic interpolation model, performing trust-region trial steps, and updating the trust-region radius and subspace. Computation details and theoretical properties will be discussed. Numerical results demonstrate the advantage of 2D-MoSub.

 

Short Bio:
Pengcheng Xie, PhD (Chinese Academy of Sciences), is joining Lawrence Berkeley National Laboratory as a postdoctoral scholar specializing in mathematical optimization and numerical analysis. He has developed optimization methods, including 2D-MoSub and SUSD-TR. Pengcheng has published in major journals and presented at ISMP 2024 (upcoming), ICIAM 2023, and CSIAM 2022. He received the Hua Loo-keng scholarship in 2019 and the CAS-AMSS Presidential scholarship in 2023.
 

Thu, 06 Jun 2024

14:00 - 15:00
Lecture Room 3

Structure-preserving hybrid finite element methods

Ari Stern
(Washington University in St. Louis, USA)
Abstract

The classical finite element method uses piecewise-polynomial function spaces satisfying continuity and boundary conditions. Hybrid finite element methods, by contrast, drop these continuity and boundary conditions from the function spaces and instead enforce them weakly using Lagrange multipliers. The hybrid approach has several numerical and implementational advantages, which have been studied over the last few decades.

 

In this talk, we show how the hybrid perspective has yielded new insights—and new methods—in structure-preserving numerical PDEs. These include multisymplectic methods for Hamiltonian PDEs, charge-conserving methods for the Maxwell and Yang-Mills equations, and hybrid methods in finite element exterior calculus.

Thu, 30 May 2024

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

This seminar has been cancelled

Marta Betcke
(University College London)
Abstract

Joint work Marta Betcke and Bolin Pan

In photoacoustic tomography (PAT) with flat sensor, we routinely encounter two types of limited data. The first is due to using a finite sensor and is especially perceptible if the region of interest is large relatively to the sensor or located farther away from the sensor. In this talk we focus on the second type caused by a varying sensitivity of the sensor to the incoming wavefront direction which can be modelled as binary i.e. by a cone of sensitivity. Such visibility conditions result, in Fourier domain, in a restriction of the data to a bowtie, akin to the one corresponding to the range of the forward operator but further narrowed according to the angle of sensitivity. 

We show how we can separate the visible and invisible wavefront directions in PAT image and data using a directional frame like Curvelets, and how such decomposition allows for decoupling of the reconstruction involving application of expensive forward/adjoint solvers from the training problem. We present fast and stable approximate Fourier domain forward and adjoint operators for reconstruction of the visible coefficients for such limited angle problem and a tailored UNet matching both the multi-scale Curvelet decomposition and the partition into the visible/invisible directions for learning the invisible coefficients from a training set of similar data.

Thu, 23 May 2024

14:00 - 15:00
Lecture Room 3

The bilevel optimization renaissance through machine learning: lessons and challenges

Alain Zemkoho
(University of Southampton)
Abstract

Bilevel optimization has been part of machine learning for over 4 decades now, although perhaps not always in an obvious way. The interconnection between the two topics started appearing more clearly in publications since about 20 years now, and in the last 10 years, the number of machine learning applications of bilevel optimization has literally exploded. This rise of bilevel optimization in machine learning has been highly positive, as it has come with many innovations in the theoretical and numerical perspectives in understanding and solving the problem, especially with the rebirth of the implicit function approach, which seemed to have been abandoned at some point.
Overall, machine learning has set the bar very high for the whole field of bilevel optimization with regards to the development of numerical methods and the associated convergence analysis theory, as well as the introduction of efficient tools to speed up components such as derivative calculations among other things. However, it remains unclear how the techniques from the machine learning—based bilevel optimization literature can be extended to other applications of bilevel programming. 
For instance, many machine learning loss functions and the special problem structures enable the fulfillment of some qualification conditions that will fail for multiple other applications of bilevel optimization. In this talk, we will provide an overview of machine learning applications of bilevel optimization while giving a flavour of corresponding solution algorithms and their limitations. 
Furthermore, we will discuss possible paths for algorithms that can tackle more complicated machine learning applications of bilevel optimization, while also highlighting lessons that can be learned for more general bilevel programs.

Thu, 16 May 2024

14:00 - 15:00
Lecture Room 3

Multilevel Monte Carlo methods for the approximation of failure probability regions

Matteo Croci
(Basque Center for Applied Mathematics)
Abstract

In this talk, we consider the problem of approximating failure regions. More specifically, given a costly computational model with random parameters and a failure condition, our objective is to determine the parameter region in which the failure condition is likely to not be satisfied. In mathematical terms, this problem can be cast as approximating the level set of a probability density function. We solve this problem by dividing it into two: 1) The design of an efficient Monte Carlo strategy for probability estimation. 2) The construction of an efficient algorithm for level-set approximation. Following this structure, this talk is comprised of two parts:

In the first part, we present a new multi-output multilevel best linear unbiased estimator (MLBLUE) for approximating expectations. The advantage of this estimator is in its convenience and optimality: Given any set of computational models with known covariance structure, MLBLUE automatically constructs a provenly optimal estimator for any (finite) number of quantities of interest. Nevertheless, the optimality of MLBLUE is tied to its optimal set-up, which requires the solution of a nonlinear optimization problem. We show how the latter can be reformulated as a semi-definite program and thus be solved reliably and efficiently.

In the second part, we construct an adaptive level-set approximation algorithm for smooth functions corrupted by noise in $\mathbb{R}^d$. This algorithm only requires point value data and is thus compatible with Monte Carlo estimators. The algorithm is comprised of a criterion for level-set adaptivity combined with an a posteriori error estimator. Under suitable assumptions, we can prove that our algorithm will correctly capture the target level set at the same cost complexity of uniformly approximating a $(d-1)$-dimensional function.

Thu, 09 May 2024

14:00 - 15:00
Lecture Room 4

Fast optimistic methods for monotone equations and convex optimization problems

Radu Bot
(University of Vienna)
Further Information

 

Please note; the seminar is taking place in Lecture Room 4 on this occasion 

Abstract

In this talk, we discuss continuous in time dynamics for the problem of approaching the set of zeros of a single-valued monotone and continuous operator V . Such problems are motivated by minimax convexconcave and, in particular, by convex optimization problems with linear constraints. The central role is played by a second-order dynamical system that combines a vanishing damping term with the time derivative of V along the trajectory, which can be seen as an analogous of the Hessian-driven damping in case the operator is originating from a potential. We show that these methods exhibit fast convergence rates for kV (z(t))k as t ! +1, where z( ) denotes the generated trajectory, and for the restricted gap function, and that z( ) converges to a zero of the operator V . For the corresponding implicit and explicit discrete time models with Nesterov’s momentum, we prove that they share the asymptotic features of the continuous dynamics.

Extensions to variational inequalities and fixed-point problems are also addressed. The theoretical results are illustrated by numerical experiments on bilinear games and the training of generative adversarial networks.

Thu, 02 May 2024

14:00 - 15:00
Lecture Room 3

Mathematics: key enabling technology for scientific machine learning

Wil Schilders
(TU Eindhoven)
Abstract

Artificial Intelligence (AI) will strongly determine our future prosperity and well-being. Due to its generic nature, AI will have an impact on all sciences and business sectors, our private lives and society as a whole. AI is pre-eminently a multidisciplinary technology that connects scientists from a wide variety of research areas, from behavioural science and ethics to mathematics and computer science.

Without downplaying the importance of that variety, it is apparent that mathematics can and should play an active role. All the more so as, alongside the successes of AI, also critical voices are increasingly heard. As Robert Dijkgraaf (former director of the Princeton Institute of Advanced Studies) observed in May 2019: ”Artificial intelligence is in its adolescent phase, characterised by trial and error, self-aggrandisement, credulity and lack of systematic understanding.” Mathematics can contribute to the much-needed systematic understanding of AI, for example, greatly improving reliability and robustness of AI algorithms, understanding the operation and sensitivity of networks, reducing the need for abundant data sets, or incorporating physical properties into neural networks needed for super-fast and accurate simulations in the context of digital twinning.

Mathematicians absolutely recognize the potential of artificial intelligence, machine learning and (deep) neural networks for future developments in science, technology and industry. At the same time, a sound mathematical treatment is essential for all aspects of artificial intelligence, including imaging, speech recognition, analysis of texts or autonomous driving, implying it is essential to involve mathematicians in all these areas. In this presentation, we highlight the role of mathematics as a key enabling technology within the emerging field of scientific machine learning. Or, as I always say: ”Real intelligence is needed to make artificial intelligence work.”

 

Thu, 25 Apr 2024

14:00 - 15:00
Lecture Room 3

ESPIRA: Estimation of Signal Parameters via Iterative Rational Approximation

Nadiia Derevianko
(University of Göttingen)
Abstract

We introduce a new method - ESPIRA (Estimation of Signal Parameters via Iterative Rational Approximation) \cite{DP22,  DPP21} - for the recovery of complex exponential  sums
$$
f(t)=\sum_{j=1}^{M} \gamma_j \mathrm{e}^{\lambda_j t},
$$
that are determined by a finite number of parameters: the order $M$, weights $\gamma_j \in \mathbb{C} \setminus \{0\}$ and nodes  $\mathrm{e}^{\lambda_j} \in \mathbb{C}$ for $j=1,...,M$.  Our new recovery procedure is based on the observation that Fourier coefficients (or DFT coefficients) of exponential sums have a special rational structure.  To  reconstruct this structure in a stable way we use the AAA algorithm  proposed by Nakatsukasa et al.   We show that ESPIRA can be interpreted as a matrix pencil method applied to Loewner matrices. 

During the talk we will demonstrate that ESPIRA outperforms Prony-like methods such as ESPRIT and MPM for noisy data and for signal approximation by short exponential sums.  

 

Bibliography
N. Derevianko,  G.  Plonka, 
Exact reconstruction of extended exponential sums using rational approximation of their Fourier coefficients, Anal.  Appl.,  20(3),  2022,  543-577.


N. Derevianko,  G. Plonka,  M. Petz, 
From ESPRIT to ESPIRA: Estimation of signal parameters by iterative rational approximation,   IMA J. Numer. Anal.,  43(2),  2023, 789--827.  


Y. Nakatsukasa, O. Sète,   L.N. Trefethen,  The AAA algorithm for rational approximation.
SIAM J. Sci. Comput., 40(3),   2018,  A1494–A1522.  

Mon, 08 Apr 2024

11:00 - 12:00
Lecture Room 3

Heavy-Tailed Large Deviations and Sharp Characterization of Global Dynamics of SGDs in Deep Learning

Chang-Han Rhee
(Northwestern University, USA)
Abstract

While the typical behaviors of stochastic systems are often deceptively oblivious to the tail distributions of the underlying uncertainties, the ways rare events arise are vastly different depending on whether the underlying tail distributions are light-tailed or heavy-tailed. Roughly speaking, in light-tailed settings, a system-wide rare event arises because everything goes wrong a little bit as if the entire system has conspired up to provoke the rare event (conspiracy principle), whereas, in heavy-tailed settings, a system-wide rare event arises because a small number of components fail catastrophically (catastrophe principle). In the first part of this talk, I will introduce the recent developments in the theory of large deviations for heavy-tailed stochastic processes at the sample path level and rigorously characterize the catastrophe principle for such processes. 

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descents (SGDs) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance such capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding of SGDs' global dynamics within complex non-convex loss landscapes. In the second part of this talk, I will characterize the global dynamics of SGDs building on the heavy-tailed large deviations and local stability framework developed in the first part. This leads to the heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noises during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance for the test data.  

 

This talk is based on the joint work with Mihail Bazhba, Jose Blanchet, Bohan Chen, Sewoong Oh, Zhe Su, Xingyu Wang, and Bert Zwart.

Thu, 07 Mar 2024

14:00 - 15:00
Lecture Room 3

Stabilized Lagrange-Galerkin schemes for viscous and viscoelastic flow problems

Hirofumi Notsu
(Kanazawa University)
Abstract

Many researchers are developing stable and accurate numerical methods for flow problems, which roughly belong to upwind methods or characteristics(-based) methods. 
The Lagrange-Galerkin method proposed and analyzed in, e.g., [O. Pironneau. NM, 1982] and [E. S\"uli. NM, 1988] is the finite element method combined with the idea of the method of characteristics; hence, it belongs to the characteristics(-based) methods. The advantages are the CFL-free robustness for convection-dominated problems and the symmetry of the resulting coefficient matrix. In this talk, we introduce stabilized Lagrange-Galerkin schemes of second order in time for viscous and viscoelastic flow problems, which employ the cheapest conforming P1-element with the help of pressure-stabilization [F. Brezzi and J. Pitk\"aranta. Vieweg+Teubner, 1984] for all the unknown functions, i.e., velocity, pressure, and conformation tensor, reducing the number of DOFs. 
Focusing on the recent developments of discretizations of the (non-conservative and conservative) material derivatives and the upper-convected time derivative, we present theoretical and numerical results.

Thu, 29 Feb 2024

14:00 - 15:00
Lecture Room 3

On the use of "conventional" unconstrained minimization solvers for training regression problems in scientific machine learning

Stefano Zampini
(King Abdullah University of Science and Technology (KAUST))
Abstract

In recent years, we have witnessed the emergence of scientific machine learning as a data-driven tool for the analysis, by means of deep-learning techniques, of data produced by computational science and engineering applications.  At the core of these methods is the supervised training algorithm to learn the neural network realization, a highly non-convex optimization problem that is usually solved using stochastic gradient methods.

However, distinct from deep-learning practice, scientific machine-learning training problems feature a much larger volume of smooth data and better characterizations of the empirical risk functions, which make them suited for conventional solvers for unconstrained optimization.

In this talk, we empirically demonstrate the superior efficacy of a trust region method based on the Gauss-Newton approximation of the Hessian in improving the generalization errors arising from regression tasks when learning surrogate models for a wide range of scientific machine-learning techniques and test cases. All the conventional solvers tested, including L-BFGS and inexact Newton with line-search, compare favorably, either in terms of cost or accuracy, with the adaptive first-order methods used to validate the surrogate models.

Thu, 22 Feb 2024

14:00 - 15:00
Lecture Room 3

Hierarchical adaptive low-rank format with applications to discretized PDEs

Leonardo Robol
(University of Pisa)
Abstract

A novel framework for hierarchical low-rank matrices is proposed that combines an adaptive hierarchical partitioning of the matrix with low-rank approximation. One typical application is the approximation of discretized functions on rectangular domains; the flexibility of the format makes it possible to deal with functions that feature singularities in small, localized regions. To deal with time evolution and relocation of singularities, the partitioning can be dynamically adjusted based on features of the underlying data. Our format can be leveraged to efficiently solve linear systems with Kronecker product structure, as they arise from discretized partial differential equations (PDEs). For this purpose, these linear systems are rephrased as linear matrix equations and a recursive solver is derived from low-rank updates of such equations. 
We demonstrate the effectiveness of our framework for stationary and time-dependent, linear and nonlinear PDEs, including the Burgers' and Allen–Cahn equations.

This is a joint work with Daniel Kressner and Stefano Massei.

Thu, 15 Feb 2024
14:00

Algorithmic Insurance

Agni Orfanoudaki
(Oxford University Saïd Business School)
Abstract

As machine learning algorithms get integrated into the decision-making process of companies and organizations, insurance products are being developed to protect their providers from liability risk. Algorithmic liability differs from human liability since it is based on data-driven models compared to multiple heterogeneous decision-makers and its performance is known a priori for a given set of data. Traditional actuarial tools for human liability do not consider these properties, primarily focusing on the distribution of historical claims. We propose, for the first time, a quantitative framework to estimate the risk exposure of insurance contracts for machine-driven liability, introducing the concept of algorithmic insurance. Our work provides ML model developers and insurance providers with a comprehensive risk evaluation approach for this new class of products. Thus, we set the foundations of a niche area of research at the intersection of the literature in operations, risk management, and actuarial science. Specifically, we present an optimization formulation to estimate the risk exposure of a binary classification model given a pre-defined range of premiums. Our approach outlines how properties of the model, such as discrimination performance, interpretability, and generalizability, can influence the insurance contract evaluation. To showcase a practical implementation of the proposed framework, we present a case study of medical malpractice in the context of breast cancer detection. Our analysis focuses on measuring the effect of the model parameters on the expected financial loss and identifying the aspects of algorithmic performance that predominantly affect the risk of the contract.

Paper Reference: Bertsimas, D. and Orfanoudaki, A., 2021. Pricing algorithmic insurance. arXiv preprint arXiv:2106.00839.

Paper link: https://arxiv.org/pdf/2106.00839.pdf

Thu, 08 Feb 2024
14:00
Lecture Room 3

From Chebfun3 to RTSMS: A journey into deterministic and randomized Tucker decompositions

Behnam Hashemi
(Leicester University)
Abstract
The Tucker decomposition is a family of representations that break up a given d-dimensional tensor into the multilinear product of a core tensor and a factor matrix along each of the d-modes. It is a useful tool in extracting meaningful insights from complex datasets and has found applications in various fields, including scientific computing, signal processing and machine learning. 
 In this talk we will first focus on the continuous framework and revisit how Tucker decomposition forms the foundation of Chebfun3 for numerical computing with 3D functions and the deterministic algorithm behind Chebfun3. The key insight is that separation of variables achieved via low-rank Tucker decomposition simplifies and speeds up lots of subsequent computations.
 We will then switch to the discrete framework and discuss a new algorithm called RTSMS (randomized Tucker with single-mode sketching). The single-mode sketching aspect of RTSMS allows utilizing simple sketch matrices which are substantially smaller than alternative methods leading to considerable performance gains. Within its least-squares strategy, RTSMS incorporates leverage scores for efficiency with Tikhonov regularization and iterative refinement for stability. RTSMS is demonstrated to be competitive with existing methods, sometimes outperforming them by a large margin.
We illustrate the benefits of Tucker decomposition via MATLAB demos solving problems from global optimization to video compression. RTSMS is joint work with Yuji Nakatsukasa.
Thu, 01 Feb 2024
14:00
Lecture Room 3

A strongly polynomial algorithm for the minimum-cost generalized flow problem

Laszlo Vegh
(LSE)
Abstract

We give a strongly polynomial algorithm for minimum cost generalized flow, and as a consequence, for all linear programs with at most two nonzero entries per row, or at most two nonzero entries per column. While strongly polynomial algorithms for the primal and dual feasibility problems have been known for a long time, various combinatorial approaches used for those problems did not seem to carry over to the minimum-cost variant.

Our approach is to show that the ‘subspace layered least squares’ interior point method, an earlier joint work with Allamigeon, Dadush, Loho and Natura requires only a strongly polynomial number of iterations for minimum cost generalized flow. We achieve this by bounding the straight line complexity, introduced in the same paper. The talk will give an overview of the interior point method as well as the combinatorial straight-line complexity analysis for the particular setting. This is joint work with Daniel Dadush, Zhuan Khye Koh, Bento Natura, and Neil Olver.