Forthcoming events in this series


Tue, 21 Jan 2020
14:00
L5

Vandermonde with Arnoldi

Nick Trefethen
(Oxford)
Abstract

Vandermonde matrices are exponentially ill-conditioned, rendering the familiar “polyval(polyfit)” algorithm for polynomial interpolation and least-squares fitting ineffective at higher degrees. We show that Arnoldi orthogonalization fixes the problem.
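To make the idea concrete, here is a minimal NumPy sketch of the Vandermonde-with-Arnoldi approach (following the published algorithm in outline, not the speaker's own code): the monomial basis is replaced on the fly by a discretely orthogonal basis built by an Arnoldi recurrence, and the stored Hessenberg entries are replayed to evaluate the fit at new points.

```python
import numpy as np

def polyfit_arnoldi(x, f, n):
    """Fit a degree-n polynomial to data (x, f) using Arnoldi orthogonalisation.
    Returns coefficients d in the discretely orthogonal basis and the
    Hessenberg matrix H needed to evaluate the fit elsewhere."""
    m = len(x)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = 1.0
    for k in range(n):
        q = x * Q[:, k]                       # multiply previous basis vector by x
        for j in range(k + 1):                # modified Gram-Schmidt against earlier columns
            H[j, k] = Q[:, j] @ q / m
            q = q - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(q) / np.sqrt(m)
        Q[:, k + 1] = q / H[k + 1, k]
    d, *_ = np.linalg.lstsq(Q, f, rcond=None)  # well-conditioned least-squares fit
    return d, H

def polyval_arnoldi(d, H, s):
    """Evaluate the Arnoldi-based fit at new points s by replaying the recurrence."""
    n = H.shape[1]
    W = np.zeros((len(s), n + 1))
    W[:, 0] = 1.0
    for k in range(n):
        w = s * W[:, k]
        for j in range(k + 1):
            w = w - H[j, k] * W[:, j]
        W[:, k + 1] = w / H[k + 1, k]
    return W @ d
```

For instance, fitting 1/(1 + 25x^2) on 1000 equispaced points in [-1, 1] at degree 80, np.polyfit/np.polyval typically degrades badly, while the Arnoldi-based fit above remains stable.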

Tue, 03 Dec 2019
14:30
L1

Estimation of ODE models with discretization error quantification

Takeru Matsuda
(University of Tokyo)
Abstract

We consider estimation of ordinary differential equation (ODE) models from noisy observations. For this problem, one conventional approach is to fit numerical solutions (e.g., Euler, Runge–Kutta) of ODEs to data. However, such a method does not account for the discretization error in numerical solutions and has limited estimation accuracy. In this study, we develop an estimation method that quantifies the discretization error based on data. The key idea is to model the discretization error as random variables and estimate their variance simultaneously with the ODE parameter. The proposed method has the form of iteratively reweighted least squares, where the discretization error variance is updated with the isotonic regression algorithm and the ODE parameter is updated by solving a weighted least squares problem using the adjoint system. Experimental results demonstrate that the proposed method improves estimation accuracy by accounting for the discretization error in a data-driven manner. This is a joint work with Yuto Miyatake (Osaka University).

Tue, 03 Dec 2019
14:00
L1

On symmetrizing the ultraspherical spectral method for self-adjoint problems

Mikael Slevinsky
(University of Manitoba)
Abstract

A mechanism is described to symmetrize the ultraspherical spectral method for self-adjoint problems. The resulting discretizations are symmetric and banded. An algorithm is presented for an adaptive spectral decomposition of self-adjoint operators. Several applications are explored to demonstrate the properties of the symmetrizer and the adaptive spectral decomposition.

 

Tue, 26 Nov 2019
14:30
L5

State-of-the-art Linear Algebra methods can bring significant speedups to ADMM

Nikitas Rontsis
(Oxford)
Abstract

The Alternating Directions Method of Multipliers (ADMM) is a widely popular first-order method for solving convex optimization problems. Its simplicity is arguably one of the main reasons for its popularity. For a broad class of problems, ADMM iterates by repeatedly solving perhaps the two most standard Linear Algebra problems: linear systems and symmetric eigenproblems. In this talk, we discuss how employing standard Krylov-subspace methods for ADMM can lead to tenfold speedups while retaining convergence guarantees.
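The structure of such an ADMM iteration, with the linear-system subproblem handed to a Krylov method, can be sketched as follows. This is a generic toy (a box-constrained quadratic programme with conjugate gradients for the x-update), not the solver discussed in the talk, and it omits the symmetric eigenproblem case, which Lanczos-type methods handle analogously; names and parameters are illustrative.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def box_admm_cg(P, q, lo, hi, rho=1.0, iters=200):
    """Toy ADMM for  min 0.5 x'Px + q'x  s.t.  lo <= x <= hi  (P assumed PSD),
    with the x-update linear system solved inexactly by conjugate gradients."""
    n = len(q)
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    A = LinearOperator((n, n), matvec=lambda v: P @ v + rho * v)  # P + rho*I
    for _ in range(iters):
        rhs = rho * (z - u) - q
        x, _ = cg(A, rhs, x0=x)             # warm-started Krylov solve of the x-update
        z = np.clip(x + u, lo, hi)          # projection onto the box (z-update)
        u = u + x - z                       # scaled dual update
    return z
```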

Tue, 26 Nov 2019
14:00
L5

Subspace Gauss-Newton for Nonlinear Least-Squares

Constantin Puiu
(Oxford)
Abstract


Subspace methods have the potential to outperform conventional methods, as the derivatives only need to be computed in a smaller-dimensional subspace. The sub-problem that needs to be solved at each iteration is also smaller in size, and thus the linear algebra cost is lower. However, if the subspace is not selected "properly", the progress per iteration can be significantly lower than the progress of the equivalent full-space method. Thus, "improper" selection of the subspace results in subspace methods which are actually more computationally expensive per unit of progress than their full-space alternatives. Popular subspace selection methods (such as randomised selection) fall into this category when the objective function does not have a known (exploitable) structure. We provide a simple and effective rule for choosing the subspace in the "right way" when the objective function has no such structure. We focus on Gauss-Newton for nonlinear least-squares, but the idea can be generalised to other solvers and/or objective functions. We show theoretically that the cost of this strategy per unit of progress lies between (approximately) 50% and 100% of the cost of Gauss-Newton, and give an intuition for why, in practice, it should be closer to the favorable end of the spectrum. We confirm these expectations by running numerical experiments on the CUTEst test set. We also compare the proposed selection method with randomised subspace selection. We briefly show that the method is globally convergent and has a 2-step quadratic asymptotic rate of convergence for zero-residual problems.
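A schematic of a subspace Gauss-Newton step is sketched below, with a random coordinate subspace standing in for the talk's selection rule (which is not reproduced here). The full Jacobian is formed only for clarity; in practice one would compute just the d directional derivatives J S.

```python
import numpy as np

def subspace_gauss_newton(residual, jacobian, x0, d=10, iters=50, seed=0):
    """Sketch of subspace Gauss-Newton for min 0.5||r(x)||^2: each step restricts
    the Jacobian to a d-dimensional subspace, so only a small least-squares
    problem is solved and only d directional derivatives are needed."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(iters):
        r = residual(x)
        idx = rng.choice(n, size=min(d, n), replace=False)
        S = np.eye(n)[:, idx]                 # coordinate subspace basis (placeholder rule)
        JS = jacobian(x) @ S                  # restriction of the Jacobian to the subspace
        p, *_ = np.linalg.lstsq(JS, -r, rcond=None)
        x = x + S @ p                         # step stays inside the chosen subspace
    return x
```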
 

Tue, 19 Nov 2019
14:30
L5

An approximate message passing algorithm for compressed sensing MRI

Charles Millard
(Oxford)
Abstract

The Approximate Message Passing (AMP) algorithm is a powerful iterative method for reconstructing undersampled sparse signals. Unfortunately, AMP is sensitive to the type of sensing matrix employed and frequently encounters convergence problems. One case where AMP tends to fail is compressed sensing MRI, where Fourier coefficients of a natural image are sampled with variable density. An AMP-inspired algorithm constructed specifically for MRI is presented that exhibits a 'state evolution', where at every iteration the image estimate before thresholding behaves as the ground truth corrupted by Gaussian noise with known covariance. Numerical experiments explore the practical benefits of such effective noise behaviour.
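For reference, the vanilla AMP iteration with a soft-thresholding denoiser looks as follows; the MRI-specific variant in the talk differs, but this sketch shows the pseudo-data and Onsager-correction structure that underlies the 'state evolution' property.

```python
import numpy as np

def soft(u, t):
    """Soft-thresholding denoiser."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def amp(A, y, iters=30, alpha=1.5):
    """Basic AMP for y = A x + noise with a soft-thresholding denoiser."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(iters):
        sigma = np.linalg.norm(z) / np.sqrt(m)    # estimate of the effective noise level
        r = x + A.T @ z                           # pseudo-data: approx. truth + Gaussian noise
        x = soft(r, alpha * sigma)                # denoise
        onsager = z * (np.count_nonzero(x) / m)   # Onsager correction term
        z = y - A @ x + onsager                   # corrected residual
    return x
```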
 

Tue, 19 Nov 2019
14:00
L5

Quotient-Space Boundary Element Methods for Scattering at Multi-Screens

Carolina Urzua Torres
(Oxford)
Abstract


Boundary integral equations (BIEs) are well established for solving scattering at bounded infinitely thin objects, so-called screens, which are modelled as “open surfaces” in 3D and as “open curves” in 2D. Moreover, the unknowns of these BIEs are the jumps of traces across the screen $\Gamma$. Things change considerably when considering scattering at multi-screens, which are arbitrary arrangements of thin panels that may not even be locally orientable because of junction points (2D) or junction lines (3D). Indeed, the notion of jumps of traces is no longer meaningful at these junctions. This issue can be solved by switching to a quotient-space perspective of traces, as done in recent work by Claeys and Hiptmair. In this talk, we present the extension of the quotient-space approach to the Galerkin boundary element (BE) discretization of first-kind BIEs. Unlike previous approaches, the new quotient-space BEM relies on minimal geometry information and does not require any special treatment at junctions. Moreover, it allows for a rigorous numerical analysis.
 

Tue, 12 Nov 2019
14:30
L5

Overview of a quotient geometry with simple geodesics for the manifold of fixed-rank positive-semidefinite matrices

Estelle Massart
(UC Louvain)
Abstract

We describe the main geometric tools required to work on the manifold of fixed-rank symmetric positive-semidefinite matrices: we present expressions for the Riemannian logarithm and the injectivity radius, to complement the already known Riemannian exponential. This manifold is particularly relevant when dealing with low-rank approximations of large positive-(semi)definite matrices. The manifold is represented as a quotient of the set of full-rank rectangular matrices (endowed with the Euclidean metric) by the orthogonal group. Our results make it possible to understand the failure of some curve-fitting algorithms when the rank of the data is overestimated. We illustrate these observations on a dataset made of covariance matrices characterizing a wind field.
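A sketch of the quotient-geometry formulas, as I understand them from the related literature: with X = Y Y^T represented by a full-rank Y in R^{n x k}, geodesics lift to straight lines in the total space, and the logarithm reduces to an orthogonal Procrustes alignment. Treat the exact formulas as assumptions; the logarithm is only valid within the injectivity radius discussed in the talk.

```python
import numpy as np

def psd_log(YA, YB):
    """Riemannian logarithm on the quotient R_*^{n x k} / O(k):
    align YB to YA by orthogonal Procrustes, then subtract."""
    U, _, Vt = np.linalg.svd(YB.T @ YA)
    Q = U @ Vt                       # best orthogonal alignment of YB to YA
    return YB @ Q - YA               # horizontal tangent vector at YA

def psd_exp(YA, xi):
    """Riemannian exponential: geodesics are straight lines in the total space."""
    return YA + xi

def psd_dist(YA, YB):
    """Geodesic distance between XA = YA YA^T and XB = YB YB^T."""
    return np.linalg.norm(psd_log(YA, YB))
```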

Tue, 12 Nov 2019
14:00
L5

Computing multiple local minima of topology optimisation problems

Ioannis Papadopoulos
(Oxford)
Abstract

Topology optimisation finds the optimal material distribution of a fluid or solid in a domain, subject to PDE and volume constraints. There are many formulations; we opt for the density approach, which results in a PDE-, volume- and inequality-constrained, non-convex, infinite-dimensional optimisation problem without a priori knowledge of a good initial guess. Such problems can exhibit many local minima or even no minima. In practice, heuristics are used to obtain the global minimum, but these can fail even in the simplest of cases. In this talk, we will present an algorithm that solves such problems and systematically discovers as many of these local minima as possible along the way.

Tue, 05 Nov 2019
14:30
L5

Parameter Optimization in a Global Ocean Biogeochemical Model

Sophy Oliver
(Oxford)
Abstract

Ocean biogeochemical models used in climate change predictions are very computationally expensive and heavily parameterised. With derivatives too costly to compute, we optimise the parameters within one such model using derivative-free algorithms, with the aim of finding a good optimum in the fewest possible function evaluations. We compare the performance of the evolutionary algorithm CMA-ES, a stochastic global optimisation method requiring more function evaluations, with the Py-BOBYQA and DFO-LS algorithms, which are local derivative-free solvers requiring fewer evaluations. We also use initial Latin hypercube sampling to provide DFO-LS with a good starting point, in an attempt to find the global optimum with a local solver. This is joint work with Coralia Cartis and Samar Khatiwala.
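As an illustration of the local-solver part of this workflow, here is a sketch assuming the documented Python interface of DFO-LS (dfols.solve) and SciPy's Latin hypercube sampler; the ocean model itself is replaced by a toy exponential-decay model with two parameters.

```python
import numpy as np
import dfols                      # pip install DFO-LS
from scipy.stats import qmc

# Toy "model": two parameters, synthetic observations generated at theta_true.
t = np.linspace(0.0, 1.0, 50)
theta_true = np.array([1.3, 0.4])
data = theta_true[0] * np.exp(-theta_true[1] * t)

def residuals(theta):
    """Vector of model-data misfits (DFO-LS expects the residual vector)."""
    return theta[0] * np.exp(-theta[1] * t) - data

lower, upper = np.array([0.0, 0.0]), np.array([5.0, 2.0])

# Latin hypercube sample of candidate starting points; keep the one with the
# smallest misfit, then hand it to the local derivative-free solver.
lhs = qmc.LatinHypercube(d=2, seed=0).random(n=20)
starts = lower + lhs * (upper - lower)
x0 = min(starts, key=lambda th: np.sum(residuals(th) ** 2))

soln = dfols.solve(residuals, x0, bounds=(lower, upper), maxfun=100)
print(soln.x)
```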
 

Tue, 05 Nov 2019
14:00
L5

Globally convergent least-squares optimisation methods for variational data assimilation

Maha Kaouri
(University of Reading)
Abstract

The variational data assimilation (VarDA) problem is usually solved using a method equivalent to Gauss-Newton (GN) to obtain the initial conditions for a numerical weather forecast. However, GN is not globally convergent and, if poorly initialised, may diverge; this can occur, for example, when a long time window is used in VarDA, a desirable feature that allows the use of more satellite data. To overcome this, we apply two globally convergent GN variants (line search and regularisation) to the long-window VarDA problem and show that they can locate a more accurate solution than GN within the time and cost available.
Joint work with Coralia Cartis, Amos S. Lawless, Nancy K. Nichols.
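A minimal sketch of one of the two globalised variants, Gauss-Newton with a backtracking (Armijo) line search, applied to a generic nonlinear least-squares problem; the VarDA-specific formulation is not reproduced here.

```python
import numpy as np

def gauss_newton_linesearch(r, J, x0, iters=50, tol=1e-8):
    """Gauss-Newton with backtracking line search on 0.5*||r(x)||^2.
    The plain GN step (alpha = 1) may diverge from poor starting points;
    damping the step restores global convergence to a stationary point."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        rx, Jx = r(x), J(x)
        g = Jx.T @ rx                                   # gradient of 0.5*||r||^2
        if np.linalg.norm(g) < tol:
            break
        p, *_ = np.linalg.lstsq(Jx, -rx, rcond=None)    # Gauss-Newton direction (descent if Jx full rank)
        phi0 = 0.5 * rx @ rx
        alpha = 1.0
        while (0.5 * np.sum(r(x + alpha * p) ** 2) > phi0 + 1e-4 * alpha * (g @ p)
               and alpha > 1e-12):
            alpha *= 0.5                                # backtrack until the Armijo condition holds
        x = x + alpha * p
    return x
```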

Tue, 29 Oct 2019
14:30
L5

Deciphering pattern formation via normal forms

Priya Subramanian
(Oxford)
Abstract

Complex spatial patterns such as superlattice patterns and quasipatterns occur in a variety of physical systems ranging from vibrated fluid layers to crystallising soft matter. Reduced-order models that describe such systems are usually PDEs. Close to a phase transition, modal expansion along with perturbation methods can be applied to convert the PDEs to normal form equations in the form of coupled ODEs. I use equivariant bifurcation theory along with homotopy methods (developed in computational algebraic geometry) to obtain all solutions of the normal form equations in a non-iterative manner. I want to talk about how this approach allows us to ask new questions about the physical systems of interest and what extensions to this method might be possible. This forms a step in my long-term interest to explore how to better ‘complete’ a bifurcation diagram!

Tue, 29 Oct 2019

14:00 - 14:30
L5

Sketching for Linear Least Squares

Zhen Shao
(Oxford)
Abstract

We discuss sketching techniques for sparse Linear Least Squares (LLS) problems; these techniques perform a randomised dimensionality reduction that enables more efficient and scalable solution. We give theoretical bounds for the accuracy of the sketched solution/residual when hashing matrices are used for sketching, quantifying carefully the trade-off between the coherence of the original, un-sketched matrix and the sparsity of the hashing matrix. We then use these bounds to quantify the success of our algorithm, which employs a sparse factorisation of the sketched matrix as a preconditioner for the original LLS before applying LSQR. We extensively compare our algorithm to state-of-the-art direct and iterative solvers for large-scale and sparse LLS, with encouraging results.
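The preconditioning strategy can be sketched as follows: build an s-hashing sketch of the matrix, factorise the small sketched matrix, and use its R factor as a right preconditioner for LSQR on the original problem. This is a simplified dense-matrix version with illustrative parameter choices, not the talk's implementation.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import qr, solve_triangular
from scipy.sparse.linalg import lsqr, LinearOperator

def hashing_sketch(m, k, s=2, seed=0):
    """k x m s-hashing matrix: each column has s entries equal to +/- 1/sqrt(s)."""
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, k, size=(m, s)).ravel()
    cols = np.repeat(np.arange(m), s)
    vals = rng.choice([-1.0, 1.0], size=m * s) / np.sqrt(s)
    return sp.coo_matrix((vals, (rows, cols)), shape=(k, m)).tocsr()

def sketch_precondition_lsqr(A, b, k=None, s=2):
    """Solve min ||Ax - b|| (A a dense m x n ndarray here) by sketching A,
    factorising the sketch, and running LSQR on the preconditioned operator A R^{-1}."""
    m, n = A.shape
    k = k or 4 * n
    S = hashing_sketch(m, k, s)
    _, R = qr(S @ A, mode='economic')                       # QR of the small sketched matrix
    Aop = LinearOperator(
        (m, n),
        matvec=lambda v: A @ solve_triangular(R, v),        # (A R^{-1}) v
        rmatvec=lambda v: solve_triangular(R, A.T @ v, trans='T'))
    y = lsqr(Aop, b, atol=1e-10, btol=1e-10)[0]
    return solve_triangular(R, y)                           # undo the preconditioner
```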

Tue, 22 Oct 2019

14:30 - 15:00
L5

An optimal polynomial approximation of Brownian motion

James Foster
(Oxford)
Abstract

In this talk, I will present a strong (or pathwise) approximation of standard Brownian motion by a class of orthogonal polynomials. Most notably, the coefficients obtained from this expansion are independent Gaussian random variables. This will enable us to generate approximate Brownian paths by matching certain polynomial moments. To conclude the talk, I will discuss related works and applications to numerical methods for SDEs.
 

Tue, 22 Oct 2019

14:00 - 14:30
L5

A neural network based policy iteration algorithm with global H^2-superlinear convergence for stochastic games on domains

Yufei Zhang
(Oxford)
Abstract

In this work, we propose a class of numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We exploit policy iteration to reduce the semilinear problem to a sequence of linear Dirichlet problems, which are subsequently approximated by a multilayer feedforward neural network ansatz. We establish that the numerical solutions converge globally in the H^2-norm, and further demonstrate that this convergence is superlinear by interpreting the algorithm as an inexact Newton iteration for the HJBI equation. Moreover, we construct the optimal feedback controls from the numerical value functions and deduce their convergence. The numerical schemes and convergence results are then extended to oblique derivative boundary conditions. Numerical experiments on the stochastic Zermelo navigation problem and the perpetual American option pricing problem are presented to illustrate the theoretical results and to demonstrate the effectiveness of the method.
 

Tue, 15 Oct 2019
14:30
L5

Finite Element Methods for Intrinsic Geometric Flows

Evan Gawlik
(University of Hawaii at Manoa)
Abstract

Partial differential equations governing unknown or evolving geometries are ubiquitous in applications and challenging to discretize. A great deal of numerical work in this area has focused on extrinsic geometric flows, where the evolving geometry is a curve or surface embedded in Euclidean space. Much less attention has been paid to the discretization of intrinsic geometric flows, where the evolving geometry is described by a Riemannian metric. This talk will present finite element discretizations of such flows.
 

Tue, 15 Oct 2019
14:00
L5

Wilkinson, numerical analysis, and me

Nick Trefethen
(Oxford)
Abstract

The two courses I took from Wilkinson as a graduate student at Stanford influenced me greatly.  Along with some reminiscences of those days, this talk will touch upon backward error analysis, Gaussian elimination, and Evariste Galois.  It was originally presented at the Wilkinson 100th Birthday conference in Manchester earlier this year.

 

Tue, 08 Oct 2019
14:30
L2

Robust multigrid for linear elasticity and incompressible flow

Florian Wechsung
(Oxford)
Abstract

We study nearly singular PDEs that arise in the solution of linear elasticity and incompressible flow. We will demonstrate, that due to the nearly singular nature, standard methods for the solution of the linear systems arising in a finite element discretisation for these problems fail. We motivate two key ingredients required for a robust multigrid scheme for these equations and construct robust relaxation and prolongation operators for a particular choice of discretisation.
 

Tue, 08 Oct 2019
14:00
L2

Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra

Vardan Papyan
(Stanford University)
Abstract


Numerous researchers recently applied empirical spectral analysis to the study of modern deep learning classifiers. We identify and discuss an important formal class/cross-class structure and show how it lies at the origin of the many visually striking features observed in deepnet spectra, some of which were reported in recent articles and others unveiled here for the first time. These include spectral outliers and small but distinct bumps often seen beyond the edge of a "main bulk". The structure we identify organizes the coordinates of deepnet features and back-propagated errors, indexing them as an NxC or NxCxC array. Such arrays can be indexed by a two-tuple (i,c) or a three-tuple (i,c,c'), where i runs across the indices of the train set; c runs across the class indices and c' runs across the cross-class indices. This indexing naturally induces C class means, each obtained by averaging over the indices i and c' for a fixed class c. The same indexing also naturally defines C^2 cross-class means, each obtained by averaging over the index i for a fixed class c and a cross-class c'. We develop a formal process of spectral attribution, which is used to show the outliers are attributable to the C class means; the small bump next to the "main bulk" is attributable to between-cross-class covariance; and the "main bulk" is attributable to within-cross-class covariance. Formal theoretical results validate our attribution methodology.
We show how the effects of the class/cross-class structure permeate not only the spectra of deepnet features and backpropagated errors, but also the gradients, Fisher Information matrix and Hessian, whether these are considered in the context of an individual layer or the concatenation of them all. The Kronecker or Khatri-Rao product of the class means in the features and the class/cross-class means in the backpropagated errors approximates the class/cross-class means in the gradients. These means of gradients then create C and C^2 outliers in the spectrum of the Fisher Information matrix, which is the second moment of these gradients. The outliers in the Fisher Information matrix spectrum then create outliers in the Hessian spectrum. We explain the significance of this insight by proposing a correction to KFAC, a well known second-order optimization algorithm for training deepnets.
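To make the indexing concrete, here is a small NumPy sketch of the class and cross-class means and the corresponding deviations, assuming the backpropagated errors have been collected into an array of shape (N, C, C, p); names and shapes are illustrative.

```python
import numpy as np

def class_and_crossclass_means(E):
    """E has shape (N, C, C, p): vectors indexed by train example i, class c and
    cross-class c', each entry a p-dimensional vector."""
    class_means = E.mean(axis=(0, 2))          # average over i and c'  -> shape (C, p)
    cross_class_means = E.mean(axis=0)         # average over i only    -> shape (C, C, p)
    # Deviations entering the spectral attribution described above:
    between = cross_class_means - class_means[:, None, :]   # between-cross-class part
    within = E - cross_class_means[None, :, :, :]           # within-cross-class part
    return class_means, cross_class_means, between, within
```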

Tue, 18 Jun 2019

14:30 - 15:00
L3

PathFinder: a toolbox for oscillatory quadrature

Andrew Gibbs
(KU Leuven)
Abstract

Highly oscillatory integrals arise in a range of wave-based problems. For example, they may occur when a basis for a boundary element has been enriched with oscillatory functions, or as part of a localised approximation to various short-wavelength phenomena. A range of contemporary methods exist for the efficient evaluation of such integrals. These methods have been shown to be very effective for model integrals, but may require expertise and manual intervention for integrals with higher complexity, and can be unstable in practice.

The PathFinder toolbox aims to develop robust and fully automated numerical software for a large class of oscillatory integrals. In this talk I will introduce the method of numerical steepest descent (the technique upon which PathFinder is based) with a few simple examples, which are also intended to highlight potential causes for numerical instability or manual intervention. I will then explain the novel approaches that PathFinder uses to avoid these. Finally I will present some numerical examples, demonstrating how to use the toolbox, convergence results, and an application to the parabolic wave equation.
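As a taste of the underlying technique, here is numerical steepest descent for the simplest model integral with linear phase, int_0^1 f(x) exp(i*omega*x) dx. This is not PathFinder itself, which automates path deformation for far more general phase functions, stationary points and coalescence; it only illustrates the basic idea of trading oscillation for exponential decay along deformed contours.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss

def nsd_linear_phase(f, omega, n=20):
    """Numerical steepest descent for I = int_0^1 f(x) exp(i*omega*x) dx.
    Deform onto the two steepest-descent paths x = a + i*t/omega (a = 0, 1),
    where the oscillatory factor becomes exp(-t), and apply Gauss-Laguerre."""
    t, w = laggauss(n)
    path0 = f(0.0 + 1j * t / omega)              # contribution from the endpoint 0
    path1 = f(1.0 + 1j * t / omega)              # contribution from the endpoint 1
    return (1j / omega) * (w @ path0 - np.exp(1j * omega) * (w @ path1))

# Sanity check against the exact value for f = 1:
omega = 500.0
approx = nsd_linear_phase(lambda z: np.ones_like(z), omega)
exact = (np.exp(1j * omega) - 1.0) / (1j * omega)
print(abs(approx - exact))
```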

Tue, 18 Jun 2019

14:00 - 14:30
L3

Improving the scalability of derivative-free optimisation for nonlinear least-squares problems

Lindon Roberts
(Oxford)
Abstract

In existing techniques for model-based derivative-free optimisation, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.

Tue, 11 Jun 2019

14:30 - 15:00
L2

Integrated Approaches for Stochastic Chemical Kinetics

Pamela Burrage
(Queensland)
Abstract

In this talk I discuss how we can simulate stochastic chemical kinetics when there is a memory component. This can occur when there is spatial crowding within a cell or part of a cell, which acts to constrain the motion of the molecules, which in turn changes the dynamics of the chemistry. The counterpart of the Law of Mass Action in this setting is obtained by replacing the first derivative in the ODE description of the Law of Mass Action by a time-fractional derivative, where the time-fractional index is between 0 and 1. There has been much discussion in the literature, some of it wrong, as to how we model and simulate stochastic chemical kinetics in the setting of a spatially-constrained domain – this is sometimes called anomalous diffusion kinetics.

In this presentation, I discuss some of these issues and then present two (equivalent) ways of simulating fractional stochastic chemical kinetics. The first is to replace the exponential waiting time used in Gillespie’s SSA by Mittag-Leffler waiting times (MacNamara et al. [2]), which have longer tails than in the exponential case. The other approach is to use some theory developed by Jahnke and Huisinga [1], who are able to write down the underlying probability density function for any set of mono-molecular chemical reactions (under the standard Law of Mass Action) as a convolution of binomial probability density functions, or of binomial and Poisson probability density functions. We can then extend the Jahnke and Huisinga formulation through the concept of iterated Brownian motion paths to produce exact simulations of the underlying fractional stochastic chemical process. We demonstrate the equivalence of these two approaches through simulations, and also by computing the probability density function of the underlying fractional stochastic process, as described by the fractional chemical master equation, whose solution is the Mittag-Leffler matrix function. This is computed using a clever algorithm for evaluating matrix functions by Cauchy contours (Weideman and Trefethen [3]).
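A sketch of the first approach, a Gillespie-type SSA with Mittag-Leffler waiting times, is given below. The Mittag-Leffler sampler uses a formula from the continuous-time random walk simulation literature, and the scaling of the waiting time by the total propensity is a modelling choice made so that alpha = 1 recovers the standard SSA; both should be treated as assumptions rather than the speakers' exact scheme.

```python
import numpy as np

def mittag_leffler_wait(alpha, scale, rng):
    """Sample a Mittag-Leffler waiting time (heavier-tailed than exponential);
    alpha = 1 recovers the exponential case exactly. Formula taken from the
    CTRW simulation literature (Fulger, Germano & Scalas) -- an assumption here."""
    u, v = rng.random(), rng.random()
    return -scale * np.log(u) * (np.sin(alpha * np.pi) / np.tan(alpha * np.pi * v)
                                 - np.cos(alpha * np.pi)) ** (1.0 / alpha)

def fractional_ssa(x0, stoich, propensities, t_end, alpha=0.8, seed=0):
    """Gillespie-style SSA where the exponential waiting time is replaced by a
    Mittag-Leffler waiting time, as in the first approach described above."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    t, path = 0.0, [(0.0, x.copy())]
    while t < t_end:
        a = np.array([p(x) for p in propensities])   # reaction propensities
        a0 = a.sum()
        if a0 == 0.0:
            break
        t += mittag_leffler_wait(alpha, 1.0 / a0, rng)   # scaling is a modelling choice
        j = rng.choice(len(a), p=a / a0)                 # which reaction fires
        x = x + stoich[j]
        path.append((t, x.copy()))
    return path
```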

This is joint work with Manuel Barrio (University of Valladolid, Spain), Kevin Burrage (QUT), Andre Leier (University of Alabama), Shev MacNamara (University of Technology Sydney) and T. Marquez-Lago (University of Alabama).

[1] T. Jahnke and W. Huisinga, 2007, Solving the chemical master equation for monomolecular reaction systems analytically, J. Math. Biology 54, 1, 1–26.
[2] S. MacNamara, B. Henry and W. McLean, 2017, Fractional Euler limits and their applications, SIAM J. Appl. Math. 77, 2, 447–469.
[3] J.A.C. Weideman and L.N. Trefethen, 2007, Parabolic and hyperbolic contours for computing the Bromwich integral, Math. Comp. 76, 1341–1356.

Tue, 11 Jun 2019

14:00 - 14:30
L2

The Additive Congruential Random Number (ACORN) Generator - pseudo-random sequences that are well distributed in k dimensions

Roy S Wikramaratna
(REAMC Limited)
Abstract

ACORN generators represent an approach to generating uniformly distributed pseudo-random numbers which is straightforward to implement for arbitrarily large order $k$ and modulus $M=2^{30t}$ (integer $t$). They give long-period sequences which can be proven theoretically to approximate to uniformity in up to $k$ dimensions, while empirical statistical testing demonstrates that (with a few very simple constraints on the choice of parameters and the initialisation) the resulting sequences can be expected to pass all the current standard tests.
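The generator itself is simple to state: the $k$-th order sequence satisfies $Y^m_n = (Y^{m-1}_n + Y^m_{n-1}) \bmod M$ for $m = 1, \dots, k$, with the zeroth-order term fixed at an odd seed, and the $n$-th variate is $Y^k_n / M$. A short Python sketch (parameter values illustrative; Python's arbitrary-precision integers handle $M = 2^{120}$ directly, and the final division to double precision is only for output):

```python
def acorn(seed, initial, M=2**120):
    """Generator yielding the k-th order ACORN sequence with modulus M.
    seed must be odd with 0 < seed < M; 'initial' holds the k arbitrary
    initial values Y^1_0, ..., Y^k_0 (each in [0, M))."""
    k = len(initial)
    Y = [seed] + list(initial)               # Y[m] stores Y^m_{n-1}
    while True:
        for m in range(1, k + 1):
            Y[m] = (Y[m - 1] + Y[m]) % M     # Y^m_n = (Y^{m-1}_n + Y^m_{n-1}) mod M
        yield Y[k] / M                       # uniform variate in [0, 1), double precision

# Example: order k = 10, modulus 2^120, odd seed, zero initial values.
gen = acorn(seed=123456789, initial=[0] * 10)
sample = [next(gen) for _ in range(5)]
```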

The standard TestU01 Crush and BigCrush Statistical Test Suites are used to demonstrate for ACORN generators with order $8 \leq k \leq 25$ that the statistical performance improves as the modulus increases from $2^{60}$ to $2^{120}$. With $M=2^{120}$ and $k \geq 9$, it appears that ACORN generators pass all the current TestU01 tests over a wide range of initialisations; results are presented that demonstrate the remarkable consistency of this behaviour and explore its limits.

This contrasts with corresponding results obtained for the widely-used Mersenne Twister MT19937 generator, which consistently failed on two of the tests in both the Crush and BigCrush test suites.

There are other pseudo-random number generators available which will also pass all the TestU01 tests. However, for the ACORN generators it is possible to go further: we assert that an ACORN generator might also be expected to pass any more demanding tests for $p$-dimensional uniformity that may be required in the future, simply by choosing the order $k>p$, the modulus $M=2^{30t}$ for sufficiently large $t$, together with any odd value for the seed and an arbitrary set of initial values. We note that there will be $M/2$ possible odd values for the seed, with each such choice of seed giving rise to a different $k$-th order ACORN sequence satisfying all the required tests.

This talk builds on and extends results presented at the recent discussion meeting on “Numerical algorithms for high-performance computational science” at the Royal Society London, 8-9 April 2019, see download link at bottom of web page http://acorn.wikramaratna.org/references.html.

Tue, 04 Jun 2019

14:30 - 15:00
L5

The dual approach to non-negative super-resolution: impact on primal reconstruction accuracy

Bogdan Toader
(Oxford)
Abstract

We study the problem of super-resolution using TV norm minimisation, where we recover the locations and weights of non-negative point sources from a few samples of their convolution with a Gaussian kernel. A practical approach is to solve the dual problem. In this talk, we study the stability of the recovered primal solution with respect to the solution of the dual problem. In particular, we establish a relationship between perturbations in the dual variable and perturbations in the primal variables around the optimiser. This is achieved by applying a quantitative version of the implicit function theorem in a non-trivial way.

Tue, 04 Jun 2019

14:00 - 14:30
L5

Decentralised Sparse Multi-Task Regression

Dominic Richards
(Oxford)
Abstract

We consider a sparse multi-task regression framework for fitting a collection of related sparse models. Representing models as nodes in a graph with edges between related models, a framework that fuses lasso regressions with the total variation penalty is investigated. Under a form of generalised restricted eigenvalue assumption, bounds on prediction and squared error are given that depend upon the sparsity of each model and the differences between related models. This assumption relates to the smallest eigenvalue restricted to the intersection of two cone sets of the covariance matrix constructed from each of the agents' covariances. In the case of a grid topology high-probability bounds are given that match, up to log factors, the no-communication setting of fitting a lasso on each model, divided by the number of agents.  A decentralised dual method that exploits a convex-concave formulation of the penalised problem is proposed to fit the models and its effectiveness demonstrated on simulations. (Joint work with Sahand Negahban and Patrick Rebeschini)