Forthcoming events in this series


Thu, 04 Apr 2024

16:00 - 17:00
Virtual

Differential Equation-inspired Deep Learning for Node Classification and Spatiotemporal Forecasting

Noseong Park
Further Information
Abstract

Scientific knowledge, written in the form of differential equations, plays a vital role in various deep learning fields. In this talk, I will present a graph neural network (GNN) design based on reaction-diffusion equations, which addresses the notorious oversmoothing problem of GNNs. Since the self-attention of Transformers can also be viewed as a special case of graph processing, I will show how Transformers can be enhanced in a similar way. I will also introduce a spatiotemporal forecasting model based on neural controlled differential equations (NCDEs). NCDEs were designed to process irregular time series in a continuous manner; for spatiotemporal processing, they need to be combined with a spatial processing module, i.e., a GNN. I will show how this can be done.
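As a rough illustration of the reaction-diffusion idea, here is a minimal sketch of one explicit Euler step of a graph reaction-diffusion update (a generic sketch, not the speaker's model; the Fisher-type reaction term and all parameters are assumptions):

```python
import numpy as np

def graph_laplacian(adj):
    """Symmetric normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def reaction_diffusion_step(x, lap, alpha=0.5, beta=1.0, dt=0.1):
    """One explicit Euler step of dx/dt = -alpha * L x + beta * R(x).

    The diffusion term -L x smooths node features along edges; the
    reaction term R(x) (here a Fisher-type nonlinearity, an assumption)
    counteracts the smoothing, which is the intuition for why such
    dynamics mitigate oversmoothing.
    """
    reaction = x * (1.0 - x)                     # hypothetical choice of R
    return x + dt * (-alpha * lap @ x + beta * reaction)

# toy graph: 4 nodes on a cycle, 2-dimensional node features
adj = np.array([[0., 1., 0., 1.],
                [1., 0., 1., 0.],
                [0., 1., 0., 1.],
                [1., 0., 1., 0.]])
x = np.random.default_rng(0).random((4, 2))
lap = graph_laplacian(adj)
for _ in range(10):
    x = reaction_diffusion_step(x, lap)
```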

Thu, 21 Mar 2024

16:00 - 17:00
Virtual

Data-driven surrogate modelling for astrophysical simulations: from stellar winds to supernovae

Jeremy Yates and Frederik De Ceuster
(University College London)
Further Information
Abstract

The feedback loop between simulations and observations is the driving force behind almost all discoveries in astronomy. However, as technological innovations allow us to create ever more complex simulations and make ever more detailed observations, it becomes increasingly difficult to combine the two: since we cannot do controlled experiments, we need to simulate whatever we can observe. This requires efficient simulation pipelines, including (general-relativistic-)(magneto-)hydrodynamics, particle physics, chemistry, and radiation transport. In this talk, we explore the challenges associated with these modelling efforts and discuss how adopting data-driven surrogate modelling, with proper control over model uncertainties, promises to unlock a gold mine of future discoveries. For instance, the application to stellar wind simulations can teach us about the origin of chemistry in our Universe and the building blocks for life, while supernova simulations can reveal exotic states of matter and elucidate the formation of black holes.

Thu, 15 Feb 2024

16:00 - 17:00
Virtual

From Lévy's stochastic area formula to universality of affine and polynomial processes via signature SDEs

Christa Cuchiero
(University of Vienna)
Further Information
Abstract

A plethora of stochastic models used in mathematical finance in particular, but also in population genetics and physics, stems from the class of affine and polynomial processes. The history of these processes is closely connected, on the one hand, with the important concept of tractability, that is, a substantial reduction of computational effort due to special structural features, and, on the other hand, with a unifying framework for a large number of probabilistic models. One early instance in the literature where this unifying affine and polynomial point of view can be applied is Lévy's stochastic area formula. Starting from this example, we present a guided tour through the main properties and recent results, which lead to signature stochastic differential equations (SDEs). They constitute a large class of stochastic processes, here driven by Brownian motions, whose characteristics are entire or real-analytic functions of their own signature, i.e. of iterated integrals of the process with itself, and therefore allow for generic path dependence. We show that their prolongation with the corresponding signature is an affine and polynomial process taking values in subsets of group-like elements of the extended tensor algebra. Signature SDEs are thus a class of stochastic processes which is universal within Itô processes with path-dependent characteristics and which, thanks to the affine theory, allows for a relatively explicit characterization of the Fourier-Laplace transform and hence of the full law on path space.
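For orientation, the two central objects mentioned above have standard definitions (nothing here is specific to the talk):

```latex
% Signature of a path X over [0,T]: the collection of iterated integrals
\mathrm{Sig}(X)_{0,T}^{\,i_1 \cdots i_k}
  = \int_{0 < t_1 < \cdots < t_k < T}
    \mathrm{d}X^{i_1}_{t_1} \cdots \mathrm{d}X^{i_k}_{t_k},
  \qquad k \ge 0.

% Levy's stochastic area of a planar Brownian motion (B^1, B^2):
% the antisymmetric part of the level-2 signature
A_T = \tfrac{1}{2} \int_0^T
      \bigl( B^1_t \, \mathrm{d}B^2_t - B^2_t \, \mathrm{d}B^1_t \bigr).
```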

Thu, 25 Jan 2024

16:00 - 17:00
Virtual

An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning

Anastasis Kratsios
Further Information
Abstract

We build universal approximators of continuous maps between arbitrary Polish metric spaces X and Y using universal approximators between Euclidean spaces as building blocks. Earlier results assume that the output space Y is a topological vector space. We overcome this limitation by "randomization": our approximators output discrete probability measures over Y. When X and Y are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of X and Y. For barycentric Y, including Banach spaces, R-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to Y-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning. 

As an application, we show that the solution operator to an RDE can be approximated within our framework.
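A toy sketch may help convey the "randomization" idea: a Euclidean network outputs softmax weights over a fixed set of anchor points in Y, i.e., a finitely supported probability measure, which for barycentric Y can be collapsed to a point prediction. Everything below (the anchor set, layer sizes, random weights) is an illustrative assumption, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical anchor points y_1, ..., y_m in the output space Y
# (here Y = R^2 for illustration).
anchors = rng.normal(size=(16, 2))

# A tiny Euclidean network x -> logits; its output is the discrete
# probability measure sum_j w_j * delta_{y_j}.
W1, b1 = rng.normal(size=(8, 3)), np.zeros(8)
W2, b2 = rng.normal(size=(16, 8)), np.zeros(16)

def predict_measure(x):
    h = np.tanh(W1 @ x + b1)
    logits = W2 @ h + b2
    w = np.exp(logits - logits.max())
    return w / w.sum()                 # weights of the Dirac mixture

x = rng.normal(size=3)
weights = predict_measure(x)

# For barycentric Y (e.g. a Banach space), the measure reduces to a
# Y-valued prediction: its barycenter, here a weighted average.
barycenter = weights @ anchors
```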

Based on the following articles:

A. Kratsios, C. Liu, M. Lassas, M. V. de Hoop, and I. Dokmanić, "An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning" (2023), arXiv:2304.12231

B. Acciaio, A. Kratsios, and G. Pammer, "Designing universal causal deep learning models: The geometric (Hyper)transformer" (2023), Mathematical Finance, https://onlinelibrary.wiley.com/doi/full/10.1111/mafi.12389

A. Kratsios, B. Zamanlooy, T. Liu, and I. Dokmanić, "Universal Approximation Under Constraints is Possible with Transformers" (2022), ICLR (Spotlight)

Thu, 01 Dec 2022
16:00
Virtual

Particle filters for Data Assimilation

Dan Crisan
(Imperial College London)

Note: we recommend joining the meeting using the Teams client for the best user experience.

Further Information
Abstract

Modern Data Assimilation (DA) can be traced back to the sixties and owes a lot to earlier developments in linear filtering theory. Since then, DA has evolved largely independently of filtering theory. Today it is a massively important area of research due to its many applications in meteorology, ocean prediction, hydrology, oil reservoir exploration, etc. The field has been largely driven by practitioners; however, in recent years an increasing body of theoretical work has been devoted to it. In my talk, I will advocate the interpretation of DA through the language of stochastic filtering. This interpretation allows us to make use of advanced particle filters to produce rigorously validated DA methodologies. I will present a particle filter that incorporates three additional add-on procedures: nudging, tempering and jittering. The particle filter is tested on a two-layer quasi-geostrophic model with O(10^6) degrees of freedom, of which only a minute fraction are noisily observed.
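To fix ideas, here is a minimal bootstrap particle filter with a jittering step on a toy one-dimensional model (nudging and tempering, the other two add-on procedures from the talk, are omitted for brevity; the model and all parameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_filter(y_obs, n_particles=500, sigma_x=0.5, sigma_y=1.0,
                     jitter=0.05):
    """Bootstrap particle filter for the toy model
        X_t = 0.9 X_{t-1} + N(0, sigma_x^2),   Y_t = X_t + N(0, sigma_y^2).
    """
    x = rng.normal(size=n_particles)
    means = []
    for y in y_obs:
        x = 0.9 * x + sigma_x * rng.normal(size=n_particles)   # propagate
        logw = -0.5 * ((y - x) / sigma_y) ** 2                  # weight
        w = np.exp(logw - logw.max()); w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)    # resample
        x = x[idx] + jitter * rng.normal(size=n_particles)      # jitter
        means.append(x.mean())
    return np.array(means)

y_obs = rng.normal(size=50)            # placeholder observations
posterior_means = bootstrap_filter(y_obs)
```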

Thu, 24 Nov 2022
16:00
Virtual

The Legendre Memory Unit: A neural network with optimal time series compression

Chris Eliasmith
(University of Waterloo)

Note: we recommend joining the meeting using the Teams client for the best user experience.

Further Information
Abstract

We have recently proposed a new kind of neural network, called the Legendre Memory Unit (LMU), that is provably optimal for compressing streaming time-series data. In this talk, I will describe this network and a variety of state-of-the-art results that have been achieved using the LMU. I will include recent results on speech and language applications that demonstrate significant improvements over transformers, and I will discuss variants of the original LMU that permit effective scaling on current GPUs and hold promise for extremely efficient time-series processing on edge devices.
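For the curious, the LMU's memory cell is a fixed linear time-invariant system; below is a sketch of its state-space matrices following the published formulation (reproduced here as an assumption worth checking against the NeurIPS 2019 paper), with a crude Euler discretization standing in for the zero-order-hold used in practice:

```python
import numpy as np

def lmu_matrices(d):
    """State-space matrices of the LMU memory cell (zero-based indices):
    theta * dx/dt = A x + B u approximates a sliding window of length
    theta by the first d Legendre coefficients of the input history."""
    i = np.arange(d)[:, None]
    j = np.arange(d)[None, :]
    A = (2 * i + 1) * np.where(i < j, -1.0, (-1.0) ** (i - j + 1))
    B = (2 * np.arange(d) + 1) * (-1.0) ** np.arange(d)
    return A, B

d, theta, dt = 8, 1.0, 0.01
A, B = lmu_matrices(d)
x = np.zeros(d)
for t in range(100):
    u = np.sin(2 * np.pi * t * dt)             # toy input signal
    x = x + (dt / theta) * (A @ x + B * u)     # Euler step (a simplification)
```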

Thu, 03 Nov 2022
16:00
Virtual

Signatures and Functional Expansions

Bruno Dupire
(Bloomberg)

Note: we recommend joining the meeting using the Teams client for the best user experience.

Further Information
Abstract

European option payoffs can be generated by combinations of hockey-stick payoffs or of monomials. Interestingly, path-dependent options can be generated by combinations of signatures, which are the building blocks of path dependence. We focus on the case of one asset together with time, typically the evolution of the price x as a function of the time t. The signature of a path for a given word with letters in the alphabet {t,x} (sometimes called the augmented signature of dimension 1) is an iterated Stratonovich integral with respect to the letters of the word, and it plays the role of a monomial in a Taylor expansion. For a given time horizon T, the signature elements associated with short words are contained in the linear space generated by the signature elements associated with longer words, and we construct an incremental basis of signature elements. This allows writing a smooth path-dependent payoff as a converging series of signature elements, a result stronger than the density property of signature elements given by the Stone-Weierstrass theorem. We recall the main concepts of the Functional Itô Calculus, a natural framework to model path dependence, and draw links between two approximation results: the Taylor expansion and the Wiener chaos decomposition. The Taylor expansion is obtained by iterating the functional Stratonovich formula, whilst the Wiener chaos decomposition is obtained by iterating the functional Itô formula applied to a conditional expectation. We also establish the pathwise Intrinsic Expansion and link it to the Functional Taylor Expansion.
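As a concrete aside, for a piecewise-linear path the first two signature levels reduce to explicit sums, which the sketch below computes and checks against the level-2 shuffle identity (a generic computation, not code from the talk):

```python
import numpy as np

def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: (n, d) array of sample points; in the talk's setting d = 2
    with coordinates (t, x).  For piecewise-linear paths the
    Stratonovich iterated integrals reduce to the sums below.
    """
    dX = np.diff(path, axis=0)                  # segment increments
    level1 = dX.sum(axis=0)                     # S^i = total increment
    # S^{ij} = sum_{k<l} dX^i_k dX^j_l + (1/2) sum_k dX^i_k dX^j_k
    before = np.cumsum(dX, axis=0) - dX         # sum over earlier segments
    level2 = before.T @ dX + 0.5 * dX.T @ dX
    return level1, level2

t = np.linspace(0.0, 1.0, 100)
x = np.sin(2 * np.pi * t)                       # toy price path
s1, s2 = signature_level2(np.stack([t, x], axis=1))
# shuffle identity at level 2: S^{ij} + S^{ji} = S^i S^j
assert np.allclose(s2 + s2.T, np.outer(s1, s1))
```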

Wed, 29 Jun 2022

16:00 - 17:00

Information theory with kernel methods

Francis Bach
(INRIA - Ecole Normale Supérieure)
Further Information
Abstract

I will consider the analysis of probability distributions through their associated covariance operators from reproducing kernel Hilbert spaces. In this talk, I will show that the von Neumann entropy and relative entropy of these operators are intimately related to the usual notions of Shannon entropy and relative entropy, and share many of their properties. They come together with efficient estimation algorithms from various oracles on the probability distributions. I will also present how these new notions of relative entropy lead to new upper bounds on log-partition functions, which can be used together with convex optimization within variational inference methods, providing a new family of probabilistic inference methods (based on https://arxiv.org/pdf/2202.08545.pdf; see also https://francisbach.com/information-theory-with-kernel-methods/).
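A minimal plug-in sketch of the central quantity: with a Gaussian kernel (so k(x,x) = 1), the normalized kernel matrix K/n shares its nonzero spectrum with the empirical covariance operator, so the kernel von Neumann entropy can be read off its eigenvalues (the talk's estimators from various oracles are more refined; bandwidth and data are assumptions):

```python
import numpy as np

def von_neumann_entropy(x, bandwidth=1.0):
    """-tr(rho log rho) for rho = K/n, K the Gaussian kernel matrix."""
    n = len(x)
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    rho_eigs = np.linalg.eigvalsh(np.exp(-sq / (2 * bandwidth ** 2)) / n)
    rho_eigs = rho_eigs[rho_eigs > 1e-12]       # drop numerical zeros
    return -(rho_eigs * np.log(rho_eigs)).sum()

samples = np.random.default_rng(2).normal(size=(200, 1))
print(von_neumann_entropy(samples))
```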

Thu, 26 May 2022

16:00 - 17:00
Virtual

Tensor Product Kernels for Independence

Zoltan Szabo
(London School of Economics)
Further Information
Abstract

The Hilbert-Schmidt independence criterion (HSIC) is among the most widely used approaches in machine learning and statistics to measure the independence of random variables. Despite its popularity and success in numerous applications, quite little is known about when HSIC characterizes independence. I am going to provide a complete answer to this question, with conditions that are often easy to verify in practice.

This talk is based on joint work with Bharath Sriperumbudur.
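For reference, the textbook biased (V-statistic) HSIC estimator with Gaussian kernels looks as follows; the talk's characterization question asks when the population version of this quantity vanishes only under independence (kernel choices and bandwidths below are assumptions):

```python
import numpy as np

def hsic_biased(x, y, bw_x=1.0, bw_y=1.0):
    """Biased empirical HSIC = tr(K H L H) / (n-1)^2."""
    n = len(x)
    def gram(z, bw):
        sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * bw ** 2))
    K, L = gram(x, bw_x), gram(y, bw_y)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(3)
x = rng.normal(size=(300, 1))
print(hsic_biased(x, 2 * x + 0.1 * rng.normal(size=(300, 1))))  # dependent
print(hsic_biased(x, rng.normal(size=(300, 1))))                # ~ 0
```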

Wed, 20 Apr 2022

09:00 - 10:00
Virtual

Optimization, Speed-up, and Out-of-distribution Prediction in Deep Learning

Wei Chen
(Chinese Academy of Sciences)
Further Information
Abstract

In this talk, I will introduce our investigations into how to make deep learning easier to optimize, faster to train, and more robust in out-of-distribution prediction. Specifically, we design a group-invariant optimization framework for ReLU neural networks; we compensate for the gradient delay in asynchronous distributed training; and we improve out-of-distribution prediction by incorporating "causal" invariance.

Thu, 24 Mar 2022

16:00 - 17:00
Virtual

The Geometry of Linear Convolutional Networks

Kathlén Kohn
(KTH Royal Institute of Technology)
Further Information
Abstract

We discuss linear convolutional neural networks (LCNs) and their critical points. We observe that the function space (that is, the set of functions represented by LCNs) can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network's architecture on the geometry of the function space.

For instance, for LCNs with one-dimensional convolutions having stride one and arbitrary filter sizes, we provide a full description of the boundary of the function space. We further study the optimization of an objective function over such LCNs: We characterize the relations between critical points in function space and in parameter space and show that there do exist spurious critical points. We compute an upper bound on the number of critical points in function space using Euclidean distance degrees and describe dynamical invariants for gradient descent.

This talk is based on joint work with Thomas Merkh, Guido Montúfar, and Matthew Trager.
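The polynomial identification is easy to see numerically: composing stride-one 1D convolutions multiplies the polynomials whose coefficients are the filter taps (a generic check, not code from the talk):

```python
import numpy as np

# Two stride-one filters; stacking the two convolutional layers yields
# the single filter np.convolve(f1, f2), i.e. the product of the
# associated polynomials, which is the factorization structure above.
f1 = np.array([1.0, -2.0, 3.0])
f2 = np.array([0.5, 4.0])
composed = np.convolve(f1, f2)
assert np.allclose(composed, np.polymul(f1, f2))

# Acting on a signal, layer-by-layer equals the composed filter.
x = np.random.default_rng(4).normal(size=20)
assert np.allclose(np.convolve(np.convolve(x, f1), f2),
                   np.convolve(x, composed))
```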

Thu, 10 Feb 2022

16:00 - 17:00
Virtual

Non-Parametric Estimation of Manifolds from Noisy Data

Yariv Aizenbud
(Yale University)
Further Information
Abstract

In many data-driven applications, the data follows some geometric structure, and the goal is to recover this structure. In many cases, the observed data is noisy, which makes the recovery task even more challenging. A common assumption is that the data lies on a low-dimensional manifold. Estimating a manifold from noisy samples has proven to be a challenging task. Indeed, even after decades of research, there was no computationally tractable algorithm that could accurately estimate a manifold from noisy samples with a constant level of noise.

In this talk, we will present a method that estimates a manifold and its tangent spaces. Moreover, we establish convergence rates, which are essentially as good as existing convergence rates for function estimation.

This is a joint work with Barak Sober.
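The classical first step in this area, estimating tangent spaces by local PCA, can be sketched in a few lines (the talk's estimator is considerably more sophisticated and comes with the convergence rates mentioned above; the neighbourhood size and toy data are assumptions):

```python
import numpy as np

def local_pca_tangent(points, query, k=20, dim=1):
    """Estimate the tangent space at `query` from the k nearest samples."""
    dists = np.linalg.norm(points - query, axis=1)
    nbrs = points[np.argsort(dists)[:k]]
    centered = nbrs - nbrs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:dim]          # orthonormal basis of the tangent estimate

# noisy samples from a circle (a 1-dimensional manifold in R^2)
rng = np.random.default_rng(5)
angles = rng.uniform(0.0, 2.0 * np.pi, 500)
pts = (np.stack([np.cos(angles), np.sin(angles)], axis=1)
       + 0.05 * rng.normal(size=(500, 2)))
print(local_pca_tangent(pts, np.array([1.0, 0.0])))   # roughly (0, ±1)
```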

Thu, 03 Feb 2022

16:00 - 17:00
Virtual

Optimal Thinning of MCMC Output

Chris Oates
(Newcastle University)
Further Information
Abstract

The use of heuristics to assess the convergence and compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Here we consider the problem of retrospectively selecting a subset of states, of fixed cardinality, from the sample path such that the approximation provided by their empirical distribution is close to optimal. A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required. Theoretical results guarantee consistency of the method and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations. Software is available in the Stein Thinning package in Python, R and MATLAB.
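A self-contained sketch of the greedy selection with an IMQ base kernel follows; it re-derives the standard Langevin Stein kernel rather than calling the Stein Thinning package, so the formulas and the toy Gaussian target should be treated as assumptions to check against the paper:

```python
import numpy as np

def stein_kernel(X, scores, c=1.0, beta=-0.5):
    """Langevin Stein kernel k_p from the IMQ kernel (c^2 + |x-y|^2)^beta."""
    diff = X[:, None, :] - X[None, :, :]                   # (n, n, d)
    r2 = (diff ** 2).sum(-1)
    base = (c ** 2 + r2) ** beta
    g = 2 * beta * (c ** 2 + r2) ** (beta - 1)             # grad_x k coefficient
    d = X.shape[1]
    div = (-2 * beta * d * (c ** 2 + r2) ** (beta - 1)
           - 4 * beta * (beta - 1) * r2 * (c ** 2 + r2) ** (beta - 2))
    sx_dyk = (scores[:, None, :] * (-g[..., None] * diff)).sum(-1)
    sy_dxk = (scores[None, :, :] * (g[..., None] * diff)).sum(-1)
    ss_k = (scores[:, None, :] * scores[None, :, :]).sum(-1) * base
    return div + sx_dyk + sy_dxk + ss_k

def greedy_thin(sample, score, m):
    """Greedily pick m indices to minimize the kernel Stein discrepancy."""
    K = stein_kernel(sample, score)
    idx = [int(np.argmin(np.diag(K)))]
    for _ in range(m - 1):
        idx.append(int(np.argmin(np.diag(K) / 2 + K[idx].sum(axis=0))))
    return np.array(idx)

# toy target N(0, I), so score(x) = -x; `sample` stands in for MCMC output
rng = np.random.default_rng(6)
sample = rng.normal(size=(500, 2))
indices = greedy_thin(sample, -sample, m=20)
```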

Thu, 27 Jan 2022

16:00 - 17:00
Virtual

Learning Homogenized PDEs in Continuum Mechanics

Andrew Stuart
(Caltech)
Further Information
Abstract

Neural networks have shown great success at learning function approximators between spaces X and Y, in the setting where X is a finite dimensional Euclidean space and where Y is either a finite dimensional Euclidean space (regression) or a set of finite cardinality (classification); the neural networks learn the approximator from N data pairs {x_n, y_n}. In many problems arising in the physical and engineering sciences it is desirable to generalize this setting to learn operators between spaces of functions X and Y. The talk will overview recent work in this context.

Then the talk will focus on work aimed at addressing the problem of learning operators that define the constitutive models characterizing the macroscopic behaviour of multiscale materials. Mathematically, this corresponds to using machine learning to determine appropriate homogenized equations, using data generated at the microscopic scale. Applications to viscoelasticity and crystal plasticity are given.
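One popular construction for learning operators between function spaces is a branch-trunk (DeepONet-style) network; the sketch below is that generic construction and is not claimed to be the architecture used in the talk:

```python
import torch
import torch.nn as nn

class BranchTrunkOperator(nn.Module):
    """The branch net encodes an input function from its values at fixed
    sensor points, the trunk net encodes a query location, and the output
    function value is their inner product."""
    def __init__(self, n_sensors, width=64, n_basis=32):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(),
                                    nn.Linear(width, n_basis))
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                   nn.Linear(width, n_basis))

    def forward(self, u_sensors, y_query):
        b = self.branch(u_sensors)     # (batch, n_basis)
        t = self.trunk(y_query)        # (n_query, n_basis)
        return b @ t.T                 # (batch, n_query)

model = BranchTrunkOperator(n_sensors=100)
u = torch.randn(8, 100)                        # toy input functions
y = torch.linspace(0, 1, 50).unsqueeze(-1)     # query points
out = model(u, y)                              # predicted output functions
```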

Thu, 13 Jan 2022

16:00 - 17:00
Virtual

Regularity structures and machine learning

Ilya Chevyrev
(Edinburgh University)
Further Information
Abstract

In many machine learning tasks, it is crucial to extract low-dimensional and descriptive features from a data set. In this talk, I present a method to extract features from multi-dimensional space-time signals which is motivated, on the one hand, by the success of path signatures in machine learning, and on the other hand, by the success of models from the theory of regularity structures in the analysis of PDEs. I will present a flexible definition of a model feature vector along with numerical experiments in which we combine these features with basic supervised linear regression to predict solutions to parabolic and dispersive PDEs with a given forcing and boundary conditions. Interestingly, in the dispersive case, the prediction power relies heavily on whether the boundary conditions are appropriately included in the model. The talk is based on the following joint work with Andris Gerasimovics and Hendrik Weber: https://arxiv.org/abs/2108.05879

Wed, 12 Jan 2022

09:00 - 10:00
Virtual

Learning and Learning to Solve PDEs

Bin Dong
(Peking University)
Further Information
Abstract

Deep learning continues to dominate machine learning and has been successful in computer vision, natural language processing, etc. Its impact has now expanded to many research areas in science and engineering. In this talk, I will mainly focus on some recent impacts of deep learning on computational mathematics. I will present our recent work on bridging deep neural networks with numerical differential equations, and how it may guide us in designing new models and algorithms for some scientific computing tasks. On the one hand, I will present some of our works on the design of interpretable data-driven models for system identification and model reduction. On the other hand, I will present our recent attempts at combining wisdom from numerical PDEs and machine learning to design data-driven solvers for PDEs and their applications in electromagnetic simulation.

Thu, 14 Oct 2021

16:00 - 17:00
Virtual

Kernel-based Statistical Methods for Functional Data

George Wynne
(Imperial College London)
Further Information

www.datasig.ac.uk/events

Abstract

Kernel-based statistical algorithms have found wide success in statistical machine learning in the past ten years as a non-parametric, easily computable engine for reasoning with probability measures. The main idea is to use a kernel to map probability measures, the objects of interest, into well-behaved spaces where calculations can be carried out. This methodology has found wide application, for example in two-sample testing, independence testing, goodness-of-fit testing, parameter inference and MCMC thinning. Most theoretical investigations and practical applications have focused on Euclidean data. This talk will outline work that adapts the kernel-based methodology to data in an arbitrary Hilbert space, which opens the door to applications for functional data, where a single data sample is a discretely observed function, for example a time series or a random surface. Such data is becoming increasingly prominent within the statistical community and in machine learning. Emphasis shall be given to the two-sample and goodness-of-fit testing problems.
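As a flavour of the construction, here is a biased squared-MMD two-sample statistic for discretely observed functions, with a Gaussian kernel built on a quadrature approximation of the L^2 distance between curves (grid, bandwidth and toy data are illustrative assumptions):

```python
import numpy as np

def mmd2_biased(X, Y, bw=1.0):
    """Biased squared MMD; each row is a function sampled on a common grid."""
    Z = np.vstack([X, Y])
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).mean(-1)  # ~ L2 distance^2
    K = np.exp(-sq / (2 * bw ** 2))
    n = len(X)
    Kxx, Kyy, Kxy = K[:n, :n], K[n:, n:], K[:n, n:]
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

rng = np.random.default_rng(7)
grid = np.linspace(0.0, 1.0, 50)
X = np.sin(2 * np.pi * grid) + 0.3 * rng.normal(size=(40, 50))
Y = np.cos(2 * np.pi * grid) + 0.3 * rng.normal(size=(40, 50))
print(mmd2_biased(X, Y))        # large for different populations
```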

Wed, 22 Sep 2021

09:00 - 10:00
Virtual

Stochastic Flows and Rough Differential Equations on Foliated Spaces

Yuzuru Inahama
(Kyushu University)
Further Information
Abstract

Stochastic differential equations (SDEs) on compact foliated spaces were introduced a few years ago. As a corollary, a leafwise Brownian motion on a compact foliated space was obtained as a solution to an SDE. In this work, we construct stochastic flows associated with such SDEs by using rough path theory, which is something like a 'deterministic version' of Itô's SDE theory.

This is joint work with Kiyotaka Suzaki.

Wed, 08 Sep 2021

09:00 - 10:00
Virtual

Co-clustering Analysis of Multidimensional Big Data

Hong Yan
(City University of Hong Kong)
Further Information
Abstract

Although a multidimensional data array can be very large, it may contain coherence patterns much smaller in size. For example, we may need to detect a subset of genes that co-express under a subset of conditions. In this presentation, we discuss our recently developed co-clustering algorithms for the extraction and analysis of coherent patterns in big datasets. In our method, a co-cluster, corresponding to a coherent pattern, is represented as a low-rank tensor and it can be detected from the intersection of hyperplanes in a high dimensional data space. Our method has been used successfully for DNA and protein data analysis, disease diagnosis, drug therapeutic effect assessment, and feature selection in human facial expression classification. Our method can also be useful for many other real-world data mining, image processing and pattern recognition applications.

Thu, 10 Jun 2021

16:00 - 17:00
Virtual

Refining Data-Driven Market Simulators and Managing their Risks

Blanka Horvath
(King's College London)
Further Information
Abstract

Techniques that address sequential data have been a central theme in machine learning research in recent years. More recently, such considerations have entered the field of finance-related ML applications in several areas where we face inherently path-dependent problems: from (deep) pricing and hedging of path-dependent options to generative modelling of synthetic market data, which we refer to as market generation.

We revisit Deep Hedging from the perspective of the role of the data streams used for training and highlight how this perspective motivates the use of highly-accurate generative models for synthetic data generation. From this, we draw conclusions regarding the implications for risk management and model governance of these applications, in contrast to risk management in classical quantitative finance approaches.

Indeed, financial ML applications and their risk management heavily rely on a solid means of measuring and efficiently computing (similarity-)metrics between datasets consisting of sample paths of stochastic processes. Stochastic processes are at their core random variables with values on path space. However, while the distance between two (finite dimensional) distributions was historically well understood, the extension of this notion to the level of stochastic processes remained a challenge until recently. We discuss the effect of different choices of such metrics while revisiting some topics that are central to ML-augmented quantitative finance applications (such as the synthetic generation and the evaluation of similarity of data streams) from a regulatory (and model governance) perspective. Finally, we discuss the effect of considering refined metrics which respect and preserve the information structure (the filtration) of the market and the implications and relevance of such metrics on financial results.

Thu, 03 Jun 2021

16:00 - 17:00
Virtual

Kinetic Brownian motion in the diffeomorphism group of a closed Riemannian manifold

Ismaël Bailleul
(Université de Rennes)
Further Information
Abstract

In its simplest instance, kinetic Brownian motion in R^d is a C^1 random path (m_t, v_t) whose unit velocity v_t is a Brownian motion on the unit sphere run at speed a > 0. Properly time-rescaled as a function of the parameter a, its position process converges to a Brownian motion in R^d as a tends to infinity. On the other side, the motion converges to straight-line (geodesic) motion as a goes to 0. Kinetic Brownian motion thus provides an interpolation between geodesic and Brownian flows in this setting. Think now about replacing R^d with the diffeomorphism group of a fluid domain, the velocity now being a vector field on the domain. I will explain how one can prove in this setting an interpolation result similar to the previous one, giving an interpolation between Euler's equations of incompressible flows and a Brownian-like flow on the diffeomorphism group.
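In the planar case the definition is completely explicit; in LaTeX notation (the standard formulation, not specific to the talk):

```latex
% Kinetic Brownian motion in R^2: unit velocity moving on the circle
v_t = (\cos\theta_t, \sin\theta_t), \qquad
\mathrm{d}\theta_t = a\, \mathrm{d}W_t, \qquad
\mathrm{d}m_t = v_t\, \mathrm{d}t,
% with W a standard real Brownian motion.  After a suitable time
% rescaling (depending on a), the position process converges to a
% Brownian motion as a \to \infty and to straight-line motion as a \to 0.
```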

Thu, 13 May 2021

16:00 - 17:00
Virtual

High-dimensional, multiscale online changepoint detection

Richard Samworth
(DPMMS University of Cambridge)
Further Information
Abstract

We introduce a new method for high-dimensional, online changepoint detection in settings where a $p$-variate Gaussian data stream may undergo a change in mean. The procedure works by performing likelihood ratio tests against simple alternatives of different scales in each coordinate, and then aggregating test statistics across scales and coordinates. The algorithm is online in the sense that both its storage requirements and worst-case computational complexity per new observation are independent of the number of previous observations. We prove that the patience, or average run length under the null, of our procedure is at least the desired nominal level, and provide guarantees on its response delay under the alternative that depend on the sparsity of the vector of mean change. Simulations confirm the practical effectiveness of our proposal, which is implemented in the R package 'ocd', and we also demonstrate its utility on a seismology data set.
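The flavour of such online procedures can be conveyed by a much simpler relative: one two-sided CUSUM per coordinate with max-aggregation. This is only a basic ingredient, not the authors' ocd procedure (which also aggregates across scales and carries the guarantees above); all constants are illustrative:

```python
import numpy as np

def online_cusum_max(stream, drift=0.5, threshold=10.0):
    """O(p) storage and O(p) work per observation, independent of history.

    stream: iterable of p-dimensional observations with pre-change mean 0.
    """
    pos = neg = None
    for t, x in enumerate(stream):
        if pos is None:
            pos, neg = np.zeros_like(x), np.zeros_like(x)
        pos = np.maximum(0.0, pos + x - drift)   # upward mean shifts
        neg = np.maximum(0.0, neg - x - drift)   # downward mean shifts
        if max(pos.max(), neg.max()) > threshold:
            return t                             # declare a changepoint
    return None

rng = np.random.default_rng(8)
p, change_at = 100, 300
data = rng.normal(size=(600, p))
data[change_at:, :3] += 1.0                      # sparse mean change
print(online_cusum_max(data))                    # detects shortly after 300
```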

Thu, 06 May 2021

16:00 - 17:00
Virtual

New perspectives on rough paths, signatures and signature cumulants

Peter K Friz
(Berlin University of Technology)
Further Information
Abstract

We revisit rough paths and signatures from a geometric and "smooth model" perspective. This provides a lean framework to understand and formulate key concepts of the theory, including recent insights on higher-order translation, also known as renormalization of rough paths. The first part is joint work with C Bellingeri (TU Berlin) and S Paycha (U Potsdam). In the second part, we take a semimartingale perspective and more specifically analyze the structure of expected signatures when written in exponential form. Following Bonnier-Oberhauser (2020), we call the resulting objects signature cumulants. These can be described, and recursively computed, in a way that can be seen as a unification of previously unrelated pieces of mathematics, including Magnus (1954), Lyons-Ni (2015), Gatheral and coworkers (2017 onwards) and Lacoin-Rhodes-Vargas (2019). This is joint work with P Hager and N Tapia.
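In symbols, the objects of the second part can be written as follows (a sketch of the notation only, following the definition described in the abstract):

```latex
% Signature cumulant of a semimartingale X: the tensor-algebra logarithm
% of the expected signature,
\kappa_t \;=\; \log \mathbb{E}\bigl[\mathrm{Sig}(X)_{0,t}\bigr],
% in analogy with the scalar cumulant generating function
% \log \mathbb{E}[e^{\lambda Z}]; the first level recovers the expected
% increment, and higher levels encode covariance and expected-area data.
```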

Thu, 29 Apr 2021

16:00 - 17:00
Virtual

Nonlinear Independent Component Analysis: Identifiability, Self-Supervised Learning, and Likelihood

Aapo Hyvärinen
(University of Helsinki)
Further Information
Abstract

Unsupervised learning, in particular learning general nonlinear representations, is one of the deepest problems in machine learning. Estimating latent quantities in a generative model provides a principled framework, and has been successfully used in the linear case, especially in the form of independent component analysis (ICA). However, extending ICA to the nonlinear case has proven to be extremely difficult: a straightforward extension is unidentifiable, i.e. it is not possible to recover the latent components that actually generated the data. Recently, we have shown that this problem can be solved by using additional information, in particular in the form of temporal structure or some additional observed variable. Our methods were originally based on the "self-supervised" learning increasingly used in deep learning, but in more recent work we have provided likelihood-based approaches. In particular, we have developed computational methods for efficient maximization of the likelihood for two variants of the model, based on variational inference and Riemannian relative gradients, respectively.

Wed, 21 Apr 2021
09:00
Virtual

Learning developmental path signature features with deep learning framework for infant cognitive scores prediction

Xin Zhang
(South China University of Technology)
Further Information
Abstract

Path signatures have unique advantages in extracting high-order differential features of sequential data. Our team has been studying path signature theory and has actively applied it to various applications, including infant cognitive score prediction, human motion recognition, handwritten character recognition, handwritten text line recognition and writer identification. In this talk, I will share our most recent work on infant cognitive score prediction using deep path signatures. The cognitive score can reveal an individual's intelligence, motor and language abilities. Recent research has discovered that cognitive ability is closely related to an individual's cortical structure and its development. We have proposed two frameworks to predict the cognitive score with different path signature features. In the first framework, we construct the temporal path signature along age growth and extract signature features of developing infant cortical features. By incorporating the cortical path signature into a multi-stream deep learning model, the individual cognitive score can be predicted even in the presence of missing data. In the second framework, we propose a deep path signature algorithm to compute developmental features and obtain a developmental connectivity matrix, and we design a graph convolutional network for score prediction. These two frameworks have been tested on two in-house cognitive data sets and achieve state-of-the-art results.