Thu, 09 May 2019

16:00 - 17:30
L4

Deep Learning Volatility

Blanka Horvath
(King's College London)
Abstract

We present a consistent neural network based calibration method for a number of volatility models, including the rough volatility family, that performs the calibration task within a few milliseconds for the full implied volatility surface.
The aim of neural networks in this work is an offline approximation of complex pricing functions, which are difficult to represent or time-consuming to evaluate by other means. We highlight how this perspective opens new horizons for quantitative modelling: the calibration bottleneck posed by the slow pricing of derivative contracts is lifted, bringing several model families (such as rough volatility models) within the scope of applicability in industry practice. As is customary in machine learning, the form in which information from available data is extracted and stored is crucial for network performance. With this in mind, we discuss how our approach addresses the usual challenges of machine learning solutions in a financial context (availability of training data, interpretability of results for regulators, control over generalisation errors). We present specific architectures for price approximation and calibration, and optimize these with respect to different objectives regarding accuracy, speed and robustness. We also find that including the intermediate step of learning the pricing functions of (classical or rough) models before calibration significantly improves network performance compared to direct calibration to data.
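
The two-step approach described in the abstract (learn the pricing map offline, then calibrate against the trained network) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the network size, the four-parameter model, and the synthetic stand-in pricer are all placeholders.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Step 1 (offline): learn the pricing map  theta -> implied-vol surface.
    # In practice the targets come from a slow classical pricer evaluated on
    # many sampled parameter sets; a synthetic stand-in keeps this sketch
    # self-contained.
    n_params, n_points = 4, 8 * 11             # e.g. 8 maturities x 11 strikes
    thetas = torch.rand(4096, n_params)        # sampled model parameters
    surfaces = torch.sin(thetas @ torch.rand(n_params, n_points))  # stand-in pricer

    net = nn.Sequential(
        nn.Linear(n_params, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, n_points),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(200):
        loss = ((net(thetas) - surfaces) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # Step 2 (online): calibration is now a small optimisation through the
    # frozen network, which is why it runs in milliseconds.
    market_surface = surfaces[0]               # pretend these are market quotes
    theta = torch.full((1, n_params), 0.5, requires_grad=True)
    calib = torch.optim.Adam([theta], lr=1e-2)
    for _ in range(500):
        cal_loss = ((net(theta) - market_surface) ** 2).mean()
        calib.zero_grad(); cal_loss.backward(); calib.step()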

Thu, 02 May 2019

16:00 - 17:30
L4

Equilibrium asset pricing with transaction costs

Johannes Muhle-Karbe
(Imperial College London)
Abstract


In the first part of the talk, we study risk-sharing equilibria where heterogeneous agents trade subject to quadratic transaction costs. The corresponding equilibrium asset prices and trading strategies are characterised by a system of nonlinear, fully coupled forward-backward stochastic differential equations. We show that a unique solution generally exists provided that the agents’ preferences are sufficiently similar. In a benchmark specification, the illiquidity discounts and liquidity premia observed empirically correspond to a positive relationship between transaction costs and volatility.
In the second part of the talk, we discuss how the model can be calibrated to time series of prices and the corresponding trading volume, and explain how extensions of the model with general transaction costs, for example, can be solved numerically using the deep learning approach of Han, Jentzen, and E (2018).
(Based on joint works with Martin Herdegen and Dylan Possamaï, as well as with Lukas Gonon and Xiaofei Shi.)
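
The deep-learning approach of Han, Jentzen, and E mentioned above parameterises the unknown initial value and the control of a backward SDE by neural networks, simulates the system forward, and penalises the mismatch with the terminal condition. A toy sketch under simplifying assumptions (one-dimensional state, zero driver, terminal condition g(x) = x^2), not the equilibrium model of the talk:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    N, T, batch = 20, 1.0, 256                 # time steps, horizon, paths
    dt = T / N

    y0 = nn.Parameter(torch.zeros(1))          # Y_0, learned
    z_net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam([y0] + list(z_net.parameters()), lr=1e-2)

    def g(x):                                  # terminal condition (toy choice)
        return x ** 2

    for step in range(2000):
        x = torch.zeros(batch, 1)              # dX = dW (toy state dynamics)
        y = y0.expand(batch, 1)
        for n in range(N):
            t = torch.full((batch, 1), n * dt)
            z = z_net(torch.cat([t, x], dim=1))
            dw = torch.randn(batch, 1) * dt ** 0.5
            y = y + z * dw                     # dY = Z dW (driver f = 0 here)
            x = x + dw
        loss = ((y - g(x)) ** 2).mean()        # match the terminal condition
        opt.zero_grad(); loss.backward(); opt.step()
    # y0 now approximates E[g(W_T)] = T = 1 for this toy problem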

 
Fri, 08 Mar 2019

12:00 - 13:00
L4

Programmatically Structured Representations for Robust Autonomy in Robots

Subramanian Ramamoorthy
(University of Edinburgh and FiveAI)
Abstract


A defining feature of robotics today is the use of learning and autonomy in the inner loop of systems that are actually being deployed in the real world, e.g., in autonomous driving or medical robotics. While it is clear that useful autonomous systems must learn to cope with a dynamic environment, requiring architectures that address the richness of the worlds in which such robots must operate, it is equally clear that ensuring the safety of such systems is the single biggest obstacle to scaling up these solutions. I will discuss an approach to system design that aims to address this problem by incorporating programmatic structure into the network architectures used for policy learning, and present results from two projects in this direction.

Firstly, I will present the perceptor gradients algorithm – a novel approach to learning symbolic representations based on the idea of decomposing an agent’s policy into (i) a perceptor network extracting symbols from raw observation data and (ii) a task-encoding program which maps the input symbols to output actions. We show that the proposed algorithm is able to learn representations that can be directly fed into a Linear-Quadratic Regulator (LQR) or a general-purpose A* planner. Our experimental results confirm that the perceptor gradients algorithm is able to efficiently learn transferable symbolic representations as well as generate new observations according to a semantically meaningful specification.
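
A minimal sketch of the policy decomposition just described, with everything concrete here (the toy observations, the fixed linear "program", the supervision signal) being an illustrative assumption rather than the paper's setup:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # (i) perceptor: raw observation -> symbols (here, a 2-D position)
    perceptor = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

    # (ii) task-encoding program: a fixed controller over the symbols,
    # e.g. a precomputed LQR gain K (illustrative values)
    K = torch.tensor([[1.0, 0.5], [0.0, 1.0]])
    def program(symbols):
        return -symbols @ K.T                  # u = -K s

    opt = torch.optim.Adam(perceptor.parameters(), lr=1e-3)
    for step in range(1000):
        obs = torch.randn(64, 16)              # stand-in raw observations
        true_state = obs[:, :2]                # pretend dims 0-1 are the state
        action = program(perceptor(obs))       # gradients flow through the perceptor
        target = program(true_state)           # supervision, e.g. from demonstrations
        loss = ((action - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()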

Next, I will describe work on learning from demonstration where the task representation is that of hybrid control systems, with emphasis on extracting models that are explicitly verifiable and easily interpreted by robot operators. Through an architecture that goes from the sensorimotor level, where a sequence of controllers is fitted by sequential importance sampling under a generative switching proportional-controller task model, up to higher-level modules that can induce a program for a visuomotor reaching task, involving loops and conditionals, from a single demonstration, we show how a robot can learn tasks such as tower building in a manner that is interpretable and eventually verifiable.
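
To illustrate the lowest layer of that architecture, here is a crude sketch of segmenting a demonstration under a switching proportional-controller model: fit u = a·x + b on each candidate segment and pick the changepoint minimising the total residual. The data and the exhaustive search are illustrative stand-ins for the generative model and sequential importance sampling used in the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # demonstration from two proportional controllers u = k (g - x),
    # switching halfway (a stand-in for real sensorimotor data)
    x = rng.normal(size=200)
    k_true = np.r_[np.full(100, 2.0), np.full(100, 0.5)]
    u = k_true * (1.0 - x) + 0.05 * rng.normal(size=200)

    def residual(xs, us):
        # cost of fitting one proportional controller u = a*x + b
        A = np.c_[xs, np.ones_like(xs)]
        res = np.linalg.lstsq(A, us, rcond=None)[1]
        return res[0] if res.size else 0.0

    costs = [residual(x[:t], u[:t]) + residual(x[t:], u[t:])
             for t in range(20, 180)]
    print("estimated switch at t =", 20 + int(np.argmin(costs)))  # ~100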

 

References:

1. S.V. Penkov, S. Ramamoorthy, Learning programmatically structured representations with perceptor gradients, In Proc. International Conference on Learning Representations (ICLR), 2019. http://rad.inf.ed.ac.uk/data/publications/2019/penkov2019learning.pdf

2. M. Burke, S.V. Penkov, S. Ramamoorthy, From explanation to synthesis: Compositional program induction for learning from demonstration, https://arxiv.org/abs/1902.10657
 

Fri, 01 Mar 2019

12:00 - 13:00
L4

Modular, Infinite, and Other Deep Generative Models of Data

Charles Sutton
(University of Edinburgh)
Abstract

Deep generative models provide powerful tools for fitting difficult distributions, such as those over natural images. But many of these methods, including variational autoencoders (VAEs) and generative adversarial networks (GANs), can be notoriously difficult to fit.

One well-known problem is mode collapse: models learn to characterize only a few modes of the true distribution. To address this, we introduce VEEGAN, which features a reconstructor network that reverses the action of the generator by mapping from data to noise. Our training objective retains the original asymptotic consistency guarantee of GANs, and can be interpreted as a novel autoencoder loss over the noise.
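
The reconstructor idea can be sketched in isolation: alongside the usual generator and discriminator, a reconstructor network maps data back to noise, and a penalty keeps the composition close to the identity on the prior. A schematic fragment (network sizes are arbitrary, and the full VEEGAN objective also couples the reconstructor to the discriminator, omitted here):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    dim_z, dim_x = 8, 32
    G = nn.Sequential(nn.Linear(dim_z, 64), nn.ReLU(), nn.Linear(64, dim_x))
    F = nn.Sequential(nn.Linear(dim_x, 64), nn.ReLU(), nn.Linear(64, dim_z))
    opt = torch.optim.Adam(list(G.parameters()) + list(F.parameters()), lr=1e-3)

    for step in range(1000):
        z = torch.randn(128, dim_z)
        # autoencoder loss over the noise: F should invert G on the prior
        recon = ((F(G(z)) - z) ** 2).mean()
        # ...plus adversarial terms tying G(z) to the data (omitted)
        opt.zero_grad(); recon.backward(); opt.step()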

Second, maximum mean discrepancy networks (MMD-nets) avoid some of the pathologies of GANs, but have not been able to match their performance. We present a new method of training MMD-nets, based on mapping the data into a lower-dimensional space, in which MMD training can be more effective. We call these networks Ratio-based MMD Nets, and show that, somewhat mysteriously, they perform dramatically better than ordinary MMD-nets.
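
A sketch of the main ingredient: an unbiased estimate of the squared MMD with a Gaussian kernel, computed after mapping the data to a lower-dimensional space. The projection and the kernel bandwidth are placeholders, not the learned mapping of the paper.

    import torch

    def mmd2(x, y, bandwidth=1.0):
        # unbiased estimate of MMD^2 with a Gaussian kernel
        def k(a, b):
            return torch.exp(-torch.cdist(a, b) ** 2 / (2 * bandwidth ** 2))
        m, n = len(x), len(y)
        kxx = (k(x, x).sum() - m) / (m * (m - 1))   # drop diagonal (k(a,a) = 1)
        kyy = (k(y, y).sum() - n) / (n * (n - 1))
        return kxx + kyy - 2 * k(x, y).mean()

    proj = torch.nn.Linear(64, 8)                   # placeholder low-dim map
    x, y = torch.randn(100, 64), torch.randn(100, 64) + 0.5
    loss = mmd2(proj(x), proj(y))                   # MMD in the projected space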

A final problem is deciding how many latent components are necessary for a deep generative model to fit a certain data set. We present a nonparametric Bayesian approach to this problem, based on defining a (potentially) infinitely wide deep generative model. Fitting this model is possible by combining variational inference with a Monte Carlo method from statistical physics called Russian roulette sampling. Perhaps surprisingly, we find that this modification helps with the mode collapse problem as well.
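
Russian roulette sampling makes the infinite model tractable by truncating an infinite sum at a random depth and reweighting each term by its survival probability, which leaves the estimate unbiased. A minimal sketch for a generic series, with a geometric stopping rule as an illustrative choice:

    import numpy as np

    rng = np.random.default_rng(0)

    def roulette_estimate(a, p_continue=0.9):
        """Unbiased estimate of sum_{k>=0} a(k): keep adding terms while a
        coin keeps coming up heads, dividing each term by its survival
        probability."""
        total, weight, k = 0.0, 1.0, 0
        while True:
            total += a(k) / weight
            if rng.random() > p_continue:      # stop with probability 1 - p
                return total
            weight *= p_continue
            k += 1

    # sanity check on a geometric series: sum 0.5^k = 2
    est = np.mean([roulette_estimate(lambda k: 0.5 ** k) for _ in range(10000)])
    print(est)                                 # close to 2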

 

Fri, 22 Feb 2019

12:00 - 13:00
L4

The Maximum Mean Discrepancy for Training Generative Adversarial Networks

Arthur Gretton
(UCL Gatsby Computational Neuroscience Unit)
Abstract

Generative adversarial networks (GANs) use neural networks as generative models, creating realistic samples that mimic real-life reference samples (for instance, images of faces, bedrooms, and more). These networks require an adaptive critic function during training, to teach the networks how to improve their samples to better match the reference data. I will describe a kernel divergence measure, the maximum mean discrepancy (MMD), which represents one such critic function. With gradient regularisation, the MMD is used to obtain current state-of-the-art performance on challenging image generation tasks, including 160 × 160 CelebA and 64 × 64 ImageNet. In addition to adversarial network training, I'll discuss issues of gradient bias for GANs based on integral probability metrics, and mechanisms for benchmarking GAN performance.
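
For reference (a standard definition, not specific to this talk), the MMD between distributions $P$ and $Q$ under a kernel $k$ is

    $\mathrm{MMD}^2(P,Q) = \mathbb{E}_{x,x' \sim P}[k(x,x')] + \mathbb{E}_{y,y' \sim Q}[k(y,y')] - 2\,\mathbb{E}_{x \sim P,\, y \sim Q}[k(x,y)]$,

and the critic (witness) function attaining it is $f(t) \propto \mathbb{E}_{x \sim P}[k(x,t)] - \mathbb{E}_{y \sim Q}[k(y,t)]$; the generator is trained to drive this discrepancy down while the kernel features are adapted to keep it discriminative.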

Fri, 15 Feb 2019

12:00 - 13:00
L4

Some optimisation problems in the Data Science Division at the National Physical Laboratory

Stephane Chretien
(National Physical Laboratory)
Abstract

Data science has become a topic of great interest lately and has triggered new large-scale research activities around efficient first-order methods for optimisation and Bayesian sampling. The National Physical Laboratory is addressing some of these challenges with a particular focus on robustness and confidence in the solution. In this talk, I will present some problems and recent results concerning (i) robust learning in the presence of outliers based on the Median-of-Means (MoM) principle and (ii) stability of the solution in super-resolution (joint work with A. Thompson and B. Toader).
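
The Median-of-Means principle behind (i) is simple to state: split the sample into blocks, average within each block, and return the median of the block means, which is robust whenever the outliers corrupt fewer than half the blocks. A minimal sketch (the block count is a tunable assumption):

    import numpy as np

    rng = np.random.default_rng(0)

    def median_of_means(x, n_blocks=50):
        blocks = np.array_split(rng.permutation(x), n_blocks)
        return np.median([b.mean() for b in blocks])

    # ten gross outliers shift the empirical mean by ~10,
    # but barely move the MoM estimate
    x = np.r_[rng.normal(0.0, 1.0, 990), np.full(10, 1000.0)]
    print(x.mean(), median_of_means(x))        # ~10 vs ~0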

Tue, 19 Feb 2019
12:00
L4

Mysteries of isolated horizons

Jerzy Lewandowski
(University of Warsaw)
Abstract

Mysteries of isolated horizons: the Near Horizon Geometry equation, geometric characterizations of the non-extremal Kerr horizon, spacetimes foliated by non-expanding horizons.

3-dimensional null surfaces that are Killing horizons to the second order are considered. They are embedded in 4-dimensional spacetimes that satisfy the vacuum Einstein equations with arbitrary cosmological constant. The internal geometry of 2-dimensional cross-sections of the horizons consists of an induced metric tensor and a rotation 1-form potential, and is subject to the type D equation. The equation is interesting from both the mathematical and physical points of view. Mathematically, it involves geometry, holomorphic structures and algebraic topology. Physically, the equation knows the secret of black holes: the only axisymmetric solutions on a topological sphere correspond to the Kerr / Kerr-de Sitter / Kerr-anti-de Sitter non-extremal black holes, or to the near horizon limit of the extremal ones. In the case of bifurcated horizons, the type D equation implies an additional symmetry; in this way the axial symmetry may be ensured without the rigidity theorem. The type D equation does not allow rotating horizons of topology different from that of the sphere (or its quotient). That completes a new local no-hair theorem. The type D equation is also an integrability condition for the Near Horizon Geometry equation and leads to new results on the existence of solutions.
 

Tue, 05 Feb 2019
12:00
L4

Unitarity bounds on charged/neutral state mass ratio

Dr Congkao Wen
(QMUL)
Abstract

I will talk about the implications of UV completion of quantum gravity for low-energy spectra. I will introduce the constraints on low-energy effective theories due to unitarity and analyticity of scattering amplitudes, in particular an infinite set of new unitarity constraints on the forward limit of four-point scattering amplitudes due to the work of Arkani-Hamed et al. In three dimensions, we find the constraints imply that light states with charge-to-mass ratio z greater than 1 can only be consistent if there exist other light states, preferably neutral. Applied to a 3D Standard-Model-like spectrum, where the low-energy couplings are dominated by the electron with z ~ 10^22, this provides a novel understanding of the need for light neutrinos.

Wed, 30 Jan 2019
15:00
L4

Wave: A New Family of Trapdoor Preimage Sampleable Functions Based on Codes

Thomas Debris-Alazard
(INRIA Paris)
Abstract

It is a long-standing open problem to build an efficient and secure digital signature scheme based on the hardness of decoding a linear code which could compete with widespread schemes like DSA or RSA. The latter signature schemes are broken by a quantum computer running Shor's algorithm, so code-based schemes could provide a valid quantum-resistant replacement. We present here Wave, the first "hash-and-sign" code-based signature scheme which strictly follows the GPV strategy, thereby ensuring universal unforgeability. It uses the family of ternary generalized $(U, U+V)$ codes. Our algorithm produces uniformly distributed signatures through a suitable rejection sampling (one rejection every 3 or 4 signatures). Furthermore, our scheme enjoys efficient signature and verification algorithms. Typically, for 128 bits of classical security, signatures are on the order of ten thousand bits long and the public key is on the order of one megabyte.
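
The rejection-sampling step quoted above ("one rejection every 3 or 4 signatures") is the generic mechanism for flattening a biased sampler into a uniform one: resample until an acceptance test passes, with acceptance probability inversely proportional to how over-represented the candidate is. A toy illustration of the mechanism only; none of this is Wave's actual sampler.

    import random

    random.seed(0)

    def rejection_uniform(sample, prob):
        """Accepting x with probability min(prob)/prob[x] makes the
        accepted outputs exactly uniform."""
        p_min = min(prob.values())
        while True:
            x = sample()
            if random.random() < p_min / prob[x]:
                return x

    # a biased source over four values, flattened to uniform by rejection
    probs = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}
    biased = lambda: random.choices(list(probs), weights=probs.values())[0]
    draws = [rejection_uniform(biased, probs) for _ in range(10000)]
    print({v: draws.count(v) / len(draws) for v in sorted(probs)})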

Tue, 07 May 2019

15:30 - 16:30
L4

Toric degenerations of Grassmannians

Fatemeh Mohammadi
(University of Bristol)
Abstract

Many toric degenerations and integrable systems of the Grassmannians Gr(2, n) are described by trees, or equivalently subdivisions of polygons. These degenerations can also be seen to arise from the cones of the tropicalisation of the Grassmannian. In this talk, I focus on particular combinatorial types of cones in tropical Grassmannians Gr(k, n) and prove a necessary condition for such an initial degeneration to be toric. I will present several combinatorial conjectures and computational challenges around this problem. This is based on joint works with Kristin Shaw and with Oliver Clarke.
