Forthcoming events in this series

Thu, 05 Dec 2024

Mean Field Games in a Stackelberg problem with an informed major player

Dr Philippe Bergault
(Université Paris Dauphine-PSL)
Further Information

Please join us for refreshments outside the lecture room from 15:30.


We investigate a stochastic differential game in which a major player has private information (the knowledge of a random variable), which she discloses through her control to a population of small players playing in a Nash Mean Field Game equilibrium. The major player’s cost depends on the distribution of the population, while the cost of the population depends on the random variable known by the major player. We show that the game has a relaxed solution and that the optimal control of the major player is approximately optimal in games with a large but finite number of small players. Joint work with Pierre Cardaliaguet and Catherine Rainer.

Thu, 28 Nov 2024

Regurgitative Training in Finance: Generative Models for Portfolios

Adil Rengim Cetingoz
(Centre d'Economie de la Sorbonne)
Further Information

Please join us for refreshments outside the lecture room from 15:30.

Simulation methods have always been instrumental in finance, but data-driven methods with minimal model specification, commonly referred to as generative models, have attracted increasing attention, especially after the success of deep learning in a broad range of fields. However, the adoption of these models in practice has not kept pace with the growing interest, probably due to the unique complexities and challenges of financial markets. This paper aims to contribute to a deeper understanding of the development, use and evaluation of generative models, particularly in portfolio and risk management. To this end, we begin by presenting theoretical results on the importance of initial sample size, and point out the potential pitfalls of generating far more data than originally available. We then highlight the inseparable nature of model development and the desired use case by touching on a very interesting paradox: that generic generative models inherently care less about what is important for constructing portfolios (at least the interesting ones, i.e. long-short). Based on these findings, we propose a pipeline for the generation of multivariate returns that meets conventional evaluation standards on a large universe of US equities while providing interesting insights into the stylized facts observed in asset returns and how a few statistical factors are responsible for their existence. Recognizing the need for more delicate evaluation methods, we suggest, through an example of mean-reversion strategies, a method designed to identify bad models for a given application based on regurgitative training, retraining the model using the data it has itself generated.

Thu, 14 Nov 2024

Higher-order approximation of jump-diffusion McKean--Vlasov SDEs

Dr Verena Schwarz
(University of Klagenfurt)
Further Information

Please join us for refreshments outside the lecture room from 15:30.



In this talk we study the numerical approximation of jump-diffusion McKean--Vlasov SDEs with super-linearly growing drift, diffusion and jump coefficients. In a first step, we derive the corresponding interacting particle system and define a Milstein-type approximation for it. Making use of the propagation of chaos result and investigating the error of the Milstein-type scheme, we provide convergence results for the scheme. In a second step, we discuss potential simplifications of the numerical approximation scheme for the direct approximation of the jump-diffusion McKean--Vlasov SDE. Lastly, we present the results of our numerical simulations.

Thu, 07 Nov 2024

Continuous-time persuasion by filtering

Dr Ofelia Bonesini
Further Information

Please join us for refreshments outside the lecture room from 15:30.


We frame dynamic persuasion in a partial observation stochastic control game with an ergodic criterion. The receiver controls the dynamics of a multidimensional unobserved state process. Information is provided to the receiver through a device designed by the sender that generates the observation process. 

The commitment of the sender is enforced and an exogenous information process outside the control of the sender is allowed. We develop this approach in the case where all dynamics are linear and the preferences of the receiver are linear-quadratic.

We prove a verification theorem for the existence and uniqueness of the solution of the HJB equation satisfied by the receiver’s value function. An extension to the case of persuasion of a mean field of interacting receivers is also provided. We illustrate this approach in two applications: the provision of information to electricity consumers with a smart meter designed by an electricity producer; the information provided by carbon footprint accounting rules to companies engaged in a best-in-class emissions reduction effort. In the first application, we link the benefits of information provision to the mispricing of electricity production. In the latter, we show that when firms declare a high level of best-in-class target, the information provided by stringent accounting rules offsets the Nash equilibrium effect that leads firms to increase pollution to make their target easier to achieve.

This is joint work with Prof. René Aïd, Prof. Giorgia Callegaro and Prof. Luciano Campi.

Thu, 31 Oct 2024

Re(Visiting) Large Language Models in Finance

Eghbal Rahimikia
(University of Manchester)

This study introduces a novel suite of historical large language models (LLMs) pre-trained specifically for accounting and finance, utilising a diverse set of major textual resources. The models are unique in that they are year-specific, spanning from 2007 to 2023, effectively eliminating look-ahead bias, a limitation present in other LLMs. Empirical analysis reveals that, in trading, these specialised models outperform much larger models, including the state-of-the-art LLaMA 1, 2, and 3, which are approximately 50 times their size. The findings are further validated through a range of robustness checks, confirming the superior performance of these LLMs.

Thu, 24 Oct 2024
Citi Stirling Square, London, SW1Y 5AD

Backtesting with correlated data

Nikolai Nowaczyk
(NatWest Group)

The important problem of backtesting financial models over long horizons inevitably leads to overlapping returns, giving rise to correlated samples. We propose a new method of dealing with this problem by decorrelation and show how this increases the discriminatory power of the resulting tests.
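The correlation problem described above can be illustrated with a small sketch. It does not reproduce the speaker's decorrelation method, which the abstract does not spell out. It only builds overlapping h-day returns from independent daily returns and compares the sample autocorrelation with the standard value (h - k)/h at lag k < h. The function names, the window length and the Gaussian daily returns are illustrative assumptions.

```python
import random


def overlapping_sums(daily, h):
    """Rolling h-day returns: each window shares h - 1 days with its neighbour."""
    return [sum(daily[i:i + h]) for i in range(len(daily) - h + 1)]


def sample_autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


def theoretical_autocorr(h, lag):
    """Autocorrelation of h-day overlapping sums of i.i.d. daily returns."""
    return max(0.0, (h - lag) / h)


random.seed(0)
daily = [random.gauss(0.0, 0.01) for _ in range(20000)]  # hypothetical daily returns
h = 10
returns = overlapping_sums(daily, h)
for lag in (1, 5, 10):
    print(lag, round(sample_autocorr(returns, lag), 2), theoretical_autocorr(h, lag))
```

Adjacent 10-day returns share nine of their ten days, so a test that treats them as independent samples overstates the amount of information in the backtest.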

About the speaker
Nikolai Nowaczyk is a Risk Management & AI consultant who has advised multiple institutional clients in projects around counterparty credit risk and xVA as well as data science and machine learning.
Nikolai holds a PhD in mathematics from the University of Regensburg and has been an Academic Visitor at Imperial College London.

Registration for in-person attendance is required in advance.

Register here.

Thu, 17 Oct 2024

Risk, utility and sensitivity to large losses

Dr Nazem Khan
(Mathematical Institute)
Further Information

Please join us for refreshments outside the lecture room from 15:30.

Risk and utility functionals are fundamental building blocks in economics and finance. In this paper we investigate under which conditions a risk or utility functional is sensitive to the accumulation of losses in the sense that any sufficiently large multiple of a position that exposes an agent to future losses has positive risk or negative utility. We call this property sensitivity to large losses and provide necessary and sufficient conditions thereof that are easy to check for a very large class of risk and utility functionals. In particular, our results do not rely on convexity and can therefore also be applied to most examples discussed in the recent literature, including (non-convex) star-shaped risk measures or S-shaped utility functions encountered in prospect theory. As expected, Value at Risk generally fails to be sensitive to large losses. More surprisingly, this is also true of Expected Shortfall. By contrast, expected utility functionals as well as (optimized) certainty equivalents are proved to be sensitive to large losses for many standard choices of concave and nonconcave utility functions, including S-shaped utility functions. We also show that Value at Risk and Expected Shortfall become sensitive to large losses if they are either properly adjusted or if the property is suitably localized.
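A toy example, not taken from the paper, may help fix ideas. Assume a position that loses 1 with probability 1% and gains 1 otherwise, a 5% tail level, and the exponential utility u(x) = 1 - exp(-x); all of these choices, and the sign convention that positive risk is bad, are assumptions made for illustration. Scaling the position up never produces positive Value at Risk or Expected Shortfall in this example, while expected utility eventually turns negative.

```python
import math

# Hypothetical position: lose 1 with probability 1%, gain 1 otherwise.
OUTCOMES = [(-1.0, 0.01), (1.0, 0.99)]  # (payoff, probability)


def value_at_risk(outcomes, scale, level=0.05):
    """VaR as the negative lower `level`-quantile of the scaled payoff."""
    cumulative = 0.0
    for payoff, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= level:
            return -scale * payoff
    raise ValueError("probabilities must sum to at least `level`")


def expected_shortfall(outcomes, scale, level=0.05):
    """ES as the negative average payoff over the worst `level` tail."""
    remaining, total = level, 0.0
    for payoff, prob in sorted(outcomes):
        weight = min(prob, remaining)
        total += weight * scale * payoff
        remaining -= weight
        if remaining <= 0.0:
            break
    return -total / level


def exponential_utility(outcomes, scale):
    """Expected utility with u(x) = 1 - exp(-x)."""
    return sum(prob * (1.0 - math.exp(-scale * payoff)) for payoff, prob in outcomes)


for scale in (1.0, 10.0, 100.0):
    print(scale,
          value_at_risk(OUTCOMES, scale),
          expected_shortfall(OUTCOMES, scale),
          exponential_utility(OUTCOMES, scale))
```

In this example the 1% loss sits inside the 5% tail but is outweighed by gains in the rest of the tail, which is why Expected Shortfall stays negative for every positive multiple.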

Thu, 13 Jun 2024

Path-dependent optimal transport and applications

Dr Ivan Guo
(Monash University, Melbourne)
Further Information

Please join us for refreshments outside the lecture room from 15:30.


We extend stochastic optimal transport to path-dependent settings. The problem is to find a semimartingale measure that satisfies general path-dependent constraints, while minimising a cost function on the drift and diffusion coefficients. Duality is established and expressed via non-linear path-dependent partial differential equations (PPDEs). The technique has applications in volatility calibration, including the calibration of path-dependent derivatives, LSV models, and joint SPX-VIX models. It produces a non-parametric volatility model that localises to the features of the derivatives. Another application is in the robust pricing and hedging of American options in continuous time. This is achieved by establishing duality in a space enlarged by the stopping decisions, and showing that the extremal points of martingale measures on the enlarged space are in fact martingale measures on the original space coupled with stopping times.

Thu, 06 Jun 2024
33 Canada Square, Canary Wharf, E14 5LB

Frontiers in Quantitative Finance: Professor Steve Heston: Model-free Hedging of Option Variance and Skewness

Professor Steven Heston
(University of Maryland)
Further Information

Please register via our TicketSource page.


Frontiers in Quantitative Finance is brought to you by the Oxford Mathematical and Computational Finance Group and sponsored by CitiGroup and Mosaic SmartData.

This paper parsimoniously generalizes the VIX variance index by constructing model-free factor portfolios that replicate skewness and higher moments. It then develops an infinite series to replicate option payoffs in terms of the stock, bond, and factor returns. The truncated series offers new formulas that generalize the Black-Scholes formula to hedge variance and skewness risk.

About the speaker
Steve Heston is Professor of Finance at the University of Maryland. He is known for his pioneering work on the pricing of options with stochastic volatility.
Steve graduated with a double major in Mathematics and Economics from the University of Maryland, College Park in 1983, an MBA in 1985 followed by a PhD in Finance in 1990. He has held previous faculty positions at Yale, Columbia, Washington University, and the University of Auckland in New Zealand and worked in the private sector with Goldman Sachs in Fixed Income Arbitrage and in Asset Management Quantitative Equities.

Thu, 30 May 2024

Hawkes-based microstructure of rough volatility model with sharp rise

Rouyi Zhang
(HU Berlin)
Further Information

Please join us for refreshments outside the lecture room from 15:30.

We consider the microstructure of a stochastic volatility model incorporating both market and limit orders. In our model, the volatility is driven by self-exciting arrivals of market orders as well as self-exciting arrivals of limit orders, which are modeled by Hawkes processes. The impact of market orders on future order arrivals is captured by a Hawkes kernel with power law decay, and is hence persistent. The impact of limit orders on future order arrivals is temporary, yet possibly long-lived. After suitable scaling the volatility process converges to a fractional Heston model driven by an additional Poisson random measure. The random measure generates occasional spikes in the volatility process. The spikes resemble the clustering of small jumps in the volatility process that has been frequently observed in the financial economics literature. Our results are based on novel uniqueness results for stochastic Volterra equations driven by a Poisson random measure and non-linear fractional Volterra equations.
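As a rough illustration of self-exciting arrivals with a power-law kernel, the sketch below simulates a single Hawkes stream by thinning. It is far simpler than the model in the talk (one order type, no scaling limit), and the kernel form, the parameters `alpha` and `gamma`, the baseline rate and the function names are all hypothetical choices.

```python
import random


def power_law_kernel(lag, alpha=0.4, gamma=0.5):
    """Impact of an arrival `lag` time units ago; integrates to alpha / gamma."""
    return alpha * (1.0 + lag) ** (-(1.0 + gamma))


def intensity(t, events, baseline=1.0):
    """Baseline rate plus the decaying impact of every arrival up to time t."""
    return baseline + sum(power_law_kernel(t - s) for s in events if s <= t)


def simulate_hawkes(horizon, seed=0):
    """Ogata thinning; valid because the kernel is decreasing in the lag."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        bound = intensity(t, events)  # intensity cannot exceed this before the next arrival
        t += rng.expovariate(bound)
        if t >= horizon:
            return events
        if rng.random() * bound <= intensity(t, events):
            events.append(t)


arrivals = simulate_hawkes(horizon=50.0)
print(len(arrivals), "arrivals; branching ratio", 0.4 / 0.5)
```

With these assumed parameters the branching ratio alpha / gamma is 0.8, below 1, so each arrival triggers fewer than one further arrival on average and the process does not explode.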


Thu, 16 May 2024
Stirling Square, London, SW1Y 5AD

Frontiers in Quantitative Finance Seminar: Turning tail risks into tail winds: using information geometry for portfolio optimisation

Julien Turc
(BNP Paribas)
Further Information

Registration for the talk is free but required.

Register here.


A wide variety of solutions have been proposed in order to cope with the deficiencies of Modern Portfolio Theory. The ideal portfolio should optimise the investor’s expected utility. Robustness can be achieved by ensuring that the optimal portfolio does not diverge too much from a predetermined allocation. Information geometry proposes interesting and relatively simple ways to model divergence. These techniques can be applied to the risk budgeting framework in order to extend risk budgeting and to unify various classical approaches in a single, parametric framework. By switching from entropy to divergence functions, the entropy-based techniques that are useful for risk budgeting can be applied to more traditional, constrained portfolio allocation. Using these divergence functions opens new opportunities for portfolio risk managers. This presentation is based on two papers published by the BNP Paribas QIS Lab, ‘The properties of alpha risk parity’ (2022, Entropy) and ‘Turning tail risks into tailwinds’ (2020, The Journal of Portfolio Management).

Thu, 09 May 2024

Signature Trading: A Path-Dependent Extension of the Mean-Variance Framework with Exogenous Signals

Owen Futter
(Mathematical Institute)
Further Information

Please join us for refreshments outside the lecture room from 15:30.


In this seminar we introduce a portfolio optimisation framework, in which the use of rough path signatures (Lyons, 1998) provides a novel method of incorporating path-dependencies in the joint signal-asset dynamics, naturally extending traditional factor models, while keeping the resulting formulas lightweight, tractable and easily interpretable. Specifically, we achieve this by representing a trading strategy as a linear functional applied to the signature of a path (which we refer to as “Signature Trading” or “Sig-Trading”). This allows the modeller to efficiently encode the evolution of past time-series observations into the optimisation problem. In particular, we derive a concise formulation of the dynamic mean-variance criterion alongside an explicit solution in our setting, which naturally incorporates a drawdown control in the optimal strategy over a finite time horizon. Secondly, we draw parallels between classical portfolio strategies and Sig-Trading strategies and explain how the latter leads to a pathwise extension of the classical setting via the “Signature Efficient Frontier”. Finally, we give explicit examples when trading under an exogenous signal as well as examples for momentum and pair-trading strategies, demonstrated both on synthetic and market data. Our framework combines the best of both worlds: classical theory (whose appeal lies in clear and concise formulae) and modern, flexible data-driven methods (usually represented by ML approaches) that can handle more realistic datasets. The advantage of the added flexibility of the latter is that one can bypass common issues such as the accumulation of heteroskedastic and asymmetric residuals during the optimisation phase. Overall, Sig-Trading combines the flexibility of data-driven methods without compromising on the clarity of the classical theory and our presented results provide a compelling toolbox that yields superior results for a large class of trading strategies.

This is based on joint work with Blanka Horvath and Magnus Wiese.
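The idea of a strategy as a linear functional of the signature can be sketched in a few lines. The code below computes the first two signature levels of a piecewise-linear path via Chen's identity and pairs them with a weight vector. The truncation at level 2, the weight layout and the function names are assumptions made for illustration; the talk's actual construction (choice of path, truncation level, optimisation of the weights) is not reproduced here.

```python
def signature_level_2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a list of points, each a list of d coordinates.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        step = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                # Chen's identity for appending one linear segment.
                level2[i][j] += level1[i] * step[j] + 0.5 * step[i] * step[j]
        for i in range(d):
            level1[i] += step[i]
    return level1, level2


def linear_functional(weight0, weights1, weights2, path):
    """Hypothetical position: constant plus weighted signature terms up to level 2."""
    level1, level2 = signature_level_2(path)
    d = len(level1)
    value = weight0
    value += sum(weights1[i] * level1[i] for i in range(d))
    value += sum(weights2[i][j] * level2[i][j] for i in range(d) for j in range(d))
    return value


path = [[0, 0], [1, 0], [1, 1]]  # e.g. (signal, price) observations
level1, level2 = signature_level_2(path)
print(level1, level2)
print(linear_functional(0.0, [0.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]], path))
```

The last line pairs the signature with antisymmetric level-2 weights, which returns twice the Lévy area of the path; this is one simple way in which the order of moves in the two coordinates, not just their totals, enters the position.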

Thu, 02 May 2024

Robust Duality for multi-action options with information delay

Dr Anna Aksamit
(University of Sydney)
Further Information

Please join us for refreshments outside the lecture room from 15:30.


We show the super-hedging duality for multi-action options, which generalise American options to a larger space of actions (possibly uncountable) than {stop, continue}. We work in the framework of the Bouchard & Nutz model, relying on an analytic measurable selection theorem. Finally, we consider information delay on the action component of the product space. Information delay is expressed as a possibility to look into the future in the dual formulation. This is joint work with Ivan Guo, Shidan Liu and Zhou Zhou.

Thu, 25 Apr 2024

Reinforcement Learning in near-continuous time for continuous state-action spaces

Dr Lorenzo Croissant
(CEREMADE, Université Paris-Dauphine)
Further Information

Please join us for refreshments outside the lecture room from 15:30.


We consider the reinforcement learning problem of controlling an unknown dynamical system to maximise the long-term average reward along a single trajectory. Most of the literature considers system interactions that occur in discrete time and discrete state-action spaces. Although this standpoint is suitable for games, it is often inadequate for systems in which interactions occur at a high frequency, if not in continuous time, or those whose state spaces are large if not inherently continuous. Perhaps the only exception is the linear quadratic framework for which results exist both in discrete and continuous time. However, its ability to handle continuous states comes with the drawback of a rigid dynamic and reward structure.

This work aims to overcome these shortcomings by modelling interaction times with a Poisson clock of frequency $\varepsilon^{-1}$ which captures arbitrary time scales from discrete ($\varepsilon=1$) to continuous time ($\varepsilon\downarrow0$). In addition, we consider a generic reward function and model the state dynamics according to a jump process with an arbitrary transition kernel on $\mathbb{R}^d$. We show that the celebrated optimism protocol applies when the sub-tasks (learning and planning) can be performed effectively. We tackle learning by extending the eluder dimension framework and propose an approximate planning method based on a diffusive limit ($\varepsilon\downarrow0$) approximation of the jump process.

Overall, our algorithm enjoys a regret of order $\tilde{\mathcal{O}}(\sqrt{T})$ or $\tilde{\mathcal{O}}(\varepsilon^{1/2} T+\sqrt{T})$ with the approximate planning. As the frequency of interactions blows up, the approximation error $\varepsilon^{1/2} T$ vanishes, showing that $\tilde{\mathcal{O}}(\sqrt{T})$ is attainable in near-continuous time.

Thu, 11 Apr 2024
The Auditorium, Citigroup Centre, London, E14 5LB

0DTEs: Trading, Gamma Risk and Volatility Propagation

Prof Grigory Vilkov
(Frankfurt School of Finance & Management)
Further Information

Registration is free but required. Register Here.


Investors fear that surging volumes in short-term, especially same-day expiry (0DTE), options can destabilize markets by propagating large price jumps. Contrary to the intuition that 0DTE sellers predominantly generate delta-hedging flows that aggravate market moves, high open interest gamma in 0DTEs does not propagate past volatility. 0DTEs and underlying markets have become more integrated over time, leading to a marginally stronger link between the index volatility and 0DTE trading. Nonetheless, intraday 0DTE trading volume shocks do not amplify recent past index returns, inconsistent with the view that 0DTE market growth intensifies market fragility.

About the speaker
Grigory Vilkov, Professor of Finance at the Frankfurt School of Finance and Management, holds an MBA from the University of Rochester and a Ph.D. from INSEAD, with further qualifications from Goethe University Frankfurt. He has been a professor at both Goethe University and the University of Mannheim.
His academic work focused on improving long-term portfolio strategies by building better expectations of risks, returns, and their dynamics. He is known for practical innovations in finance, such as developing forward-looking betas marketed by IvyDB OptionMetrics, establishing implied skewness and generalized lower bounds as cross-sectional stock characteristics, and creating measures for climate change exposure from earnings calls. His current research encompasses factor dispersions, factor and sector rotation, asset allocation with implied data, and machine learning in options analysis. 

Thu, 07 Mar 2024

Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

Dr Emilio Ferrucci
(Mathematical Institute, University of Oxford)
Further Information

Please join us for refreshments outside L3 from 15:30.


Predicting real-world phenomena often requires an understanding of their causal relations, not just their statistical associations. I will begin this talk with a brief introduction to the field of causal inference in the classical case of structural causal models over directed acyclic graphs, and causal discovery for static variables. Introducing the temporal dimension results in several interesting complications which are not well handled by the classical framework. The main component of a constraint-based causal discovery procedure is a statistical hypothesis test of conditional independence (CI). We develop such a test for stochastic processes, by leveraging recent advances in signature kernels. Then, we develop constraint-based causal discovery algorithms for acyclic stochastic dynamical systems (allowing for loops) that leverage temporal information to recover the entire directed graph. Assuming faithfulness and a CI oracle, our algorithm is sound and complete. We demonstrate strictly superior performance of our proposed CI test compared to existing approaches on path-space when tested on synthetic data generated from SDEs, and discuss preliminary applications to finance. This talk is based on joint work with Georg Manten, Cecilia Casolo, Søren Wengel Mogensen, Cristopher Salvi and Niki Kilbertus.

Thu, 29 Feb 2024

Martingale Benamou-Brenier: arithmetic and geometric Bass martingales

Professor Jan Obloj
(Mathematical Institute)
Further Information

Please join us for refreshments outside L3 from 15:30.


Optimal transport (OT) proves to be a powerful tool for non-parametric calibration: it allows us to take a favourite (non-calibrated) model and project it onto the space of all calibrated (martingale) models. The dual side of the problem leads to an HJB equation and a numerical algorithm to solve the projection. However, in general, this process is costly and leads to spiky vol surfaces. We are interested in special cases where the projection can be obtained semi-analytically. This leads us to the martingale equivalent of the seminal fluid-dynamics interpretation of the optimal transport (OT) problem developed by Benamou and Brenier. Specifically, given marginals, we look for the martingale which is the closest to a given archetypical model. If our archetype is the arithmetic Brownian motion, this gives the stretched Brownian motion (or the Bass martingale), studied previously by Backhoff-Veraguas, Beiglbock, Huesmann and Kallblad (and many others). Here we consider the financially more pertinent case of Black-Scholes (geometric BM) reference and show it can also be solved explicitly. In both cases, fast numerical algorithms are available.

Based on joint work with Julio Backhoff, Benjamin Joseph and Gregoire Loeper.

This talk reports on work in progress and will be given on a board.

Thu, 22 Feb 2024
The Auditorium, Citigroup Centre, London, E14 5LB

Frontiers in Quantitative Finance: Statistical Predictions of Trading Strategies in Electronic Markets

Prof Samuel N Cohen

We build statistical models to describe how market participants choose the direction, price, and volume of orders. Our dataset, which spans sixteen weeks for four shares traded in Euronext Amsterdam, contains all messages sent to the exchange and includes algorithm identification and member identification. We obtain reliable out-of-sample predictions and report the top features that predict direction, price, and volume of orders sent to the exchange. The coefficients from the fitted models are used to cluster trading behaviour and we find that algorithms registered as Liquidity Providers exhibit the widest range of trading behaviour among dealing capacities. In particular, for the most liquid share in our study, we identify three types of behaviour that we call (i) directional trading, (ii) opportunistic trading, and (iii) market making, and we find that around one third of Liquidity Providers behave as market makers.

This is based on work with Álvaro Cartea, Saad Labyad, Leandro Sánchez-Betancourt and Leon van Veldhuijzen. View the working paper here.

Attendance is free of charge but requires prior online registration. To register please click here.

Thu, 15 Feb 2024

A New Solution to Time Inconsistent Stopping Problem

Yanzhao Yang
(Mathematical Institute)
Further Information

Please join us for refreshments from 15:30 outside L3.

Time inconsistency is a situation in which a plan of actions to be taken in the future that is optimal for an agent according to today's preference may not be optimal for the same agent in the future according to the corresponding preference.
In this talk, we study a continuous dynamic time-inconsistent stopping problem with a flow of preferences of general form. We will define a solution to the problem by the rationality of the agent, and compare it with other solutions that have appeared in the literature. Some examples with respect to specific preferences will be shown as part of our analysis.
This is joint work with Hanqing Jin.

Thu, 01 Feb 2024

Some mathematical results on generative diffusion models

Dr Renyuan Xu
(University of Southern California)
Further Information

Join us for refreshments from 15:30 outside L3.


Diffusion models, which transform noise into new data instances by reversing a Markov diffusion process, have become a cornerstone in modern generative models. A key component of these models is to learn the score function through score matching. While the practical power of diffusion models has now been widely recognized, the theoretical developments remain far from mature. Notably, it remains unclear whether gradient-based algorithms can learn the score function with a provable accuracy. In this talk, we develop a suite of non-asymptotic theoretical results towards understanding the data generation process of diffusion models and the accuracy of score estimation. Our analysis covers both the optimization and the generalization aspects of the learning procedure, which also builds a novel connection to supervised learning and neural tangent kernels.

This is based on joint work with Yinbin Han and Meisam Razaviyayn (USC).

Thu, 25 Jan 2024

Causal transport on path space

Rui Lim
(Mathematical Institute, Oxford)
Further Information

Join us for refreshments from 15:30 outside L3.


Causal optimal transport and the related adapted Wasserstein distance have recently been popularized as a more appropriate alternative to the classical Wasserstein distance in the context of stochastic analysis and mathematical finance. In this talk, we establish some interesting consequences of causality for transports on the space of continuous functions between the laws of stochastic differential equations.

We first characterize bicausal transport plans and maps between the laws of stochastic differential equations. As an application, we are able to provide necessary and sufficient conditions for bicausal transport plans to be induced by bi-causal maps. Analogous to the classical case, we show that bicausal Monge transports are dense in the set of bicausal couplings between laws of SDEs with unique strong solutions and regular coefficients.

This is joint work with Rama Cont.

Thu, 18 Jan 2024

Multireference Alignment for Lead-Lag Detection in Multivariate Time Series and Equity Trading

Danni Shi
(Oxford Man Institute [OMI])
Further Information

Join us for refreshments from 15:30 outside L3.


We introduce a methodology based on Multireference Alignment (MRA) for lead-lag detection in multivariate time series, and demonstrate its applicability in developing trading strategies. Specifically designed for low signal-to-noise ratio (SNR) scenarios, our approach estimates denoised latent signals from a set of time series. We also investigate the impact of clustering the time series on the recovery of latent signals. We demonstrate that our lead-lag detection module outperforms commonly employed cross-correlation-based methods. Furthermore, we devise a cross-sectional trading strategy that capitalizes on the lead-lag relationships uncovered by our approach and attains significant economic benefits. Promising backtesting results on daily equity returns illustrate the potential of our method in quantitative finance and suggest avenues for future research.
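For context, the cross-correlation baseline that the abstract compares against can be sketched as follows. This is not the MRA method of the talk; it is a minimal version of the commonly used alternative, with hypothetical function names and a synthetic pair of series in which `x` leads `y` by three steps.

```python
import random


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] (lag >= 0 means x leads y)."""
    if lag < 0:
        return cross_correlation(y, x, -lag)
    xs, ys = x[:len(x) - lag], y[lag:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


def estimate_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] that maximises the cross-correlation."""
    return max(range(-max_lag, max_lag + 1), key=lambda k: cross_correlation(x, y, k))


rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
# Synthetic follower: y copies x with a delay of 3 steps, plus noise.
y = [(x[t - 3] if t >= 3 else 0.0) + 0.5 * rng.gauss(0, 1) for t in range(500)]
print(estimate_lag(x, y, max_lag=10))
```

The noise level here is mild; the abstract's point is that pairwise estimates of this kind degrade in low signal-to-noise settings, which is the regime the MRA approach is designed for.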

Thu, 07 Dec 2023
The Auditorium, Citigroup Centre, London, E14 5LB

Frontiers in Quantitative Finance: Large Language Models for Quantitative Finance

Dr Ioana Boier

This event is free but requires prior registration. To register, please click here.


In the contemporary AI landscape, Large Language Models (LLMs) stand out as game-changers. They redefine not only how we interact with computers via natural language but also how we identify and extract insights from vast, complex datasets. This presentation delves into the nuances of training and customizing LLMs, with a focus on their applications to quantitative finance.

About the speaker
Ioana Boier is a senior principal solutions architect at Nvidia. Her background is in Quantitative Finance and Computer Science. Prior to joining Nvidia, she was the Head of Quantitative Portfolio Solutions at Alphadyne Asset Management, and led research teams at Citadel LLC, BNP Paribas, and IBM T.J. Watson Research. She has a Ph.D. in Computer Science from Purdue University and is the author of over 30 peer-reviewed publications, 15 patents, and the winner of several awards for applied research delivered into products.
View her LinkedIn page


Frontiers in Quantitative Finance is brought to you by the Oxford Mathematical and Computational Finance Group and sponsored by CitiGroup and Mosaic SmartData.

Thu, 30 Nov 2023
Lecture Room 4, Mathematical Institute

Duality of causal distributionally robust optimization

Yifan Jiang
(Mathematical Institute (University of Oxford))

In this talk, we investigate distributionally robust optimization (DRO) in a dynamic context. We consider a general penalized DRO problem with a causal transport-type penalization. Such a penalization naturally captures the information flow generated by the models. We derive a tractable dynamic duality formula under a measure theoretic framework. Furthermore, we apply the duality to distributionally robust average value-at-risk and stochastic control problems.

Thu, 23 Nov 2023
Lecture Room 4, Mathematical Institute

Mean-field Analysis of Generalization Errors

Dr Gholamali Aminian
(Alan Turing Institute)

We propose a novel framework for exploring weak and $L_2$ generalization errors of algorithms through the lens of differential calculus on the space of probability measures. Specifically, we consider the KL-regularized empirical risk minimization problem and establish generic conditions under which the generalization error convergence rate, when training on a sample of size $n$, is $\mathcal{O}(1/n)$. In the context of supervised learning with a one-hidden-layer neural network in the mean-field regime, these conditions are reflected in suitable integrability and regularity assumptions on the loss and activation functions.
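To make the KL-regularized objective concrete, here is a minimal sketch in the simplest possible setting: a finite set of candidate parameters and a risk that is linear in the measure, in which case the minimiser is the Gibbs measure proportional to prior times exp(-beta * risk). This finite, linear setting, the inverse temperature `beta`, the risk values and the function names are assumptions for illustration; the talk concerns the more general mean-field setting and its generalization error, which this sketch does not address.

```python
import math


def kl_divergence(m, prior):
    """KL(m || prior) for two probability vectors on the same finite set."""
    return sum(p * math.log(p / q) for p, q in zip(m, prior) if p > 0.0)


def objective(m, risks, prior, beta):
    """Average empirical risk under m plus (1 / beta) * KL(m || prior)."""
    return sum(p * r for p, r in zip(m, risks)) + kl_divergence(m, prior) / beta


def gibbs_measure(risks, prior, beta):
    """Minimiser of `objective` when the risk is linear in m."""
    weights = [q * math.exp(-beta * r) for q, r in zip(prior, risks)]
    total = sum(weights)
    return [w / total for w in weights]


risks = [0.9, 0.2, 0.5]  # hypothetical empirical risks of three candidate parameters
prior = [1 / 3, 1 / 3, 1 / 3]
beta = 4.0
m_star = gibbs_measure(risks, prior, beta)
print(m_star, objective(m_star, risks, prior, beta))
```

In this setting the optimal value equals -(1 / beta) * log(sum of prior * exp(-beta * risk)), which gives a quick check that the returned measure is indeed the minimiser.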