16:00
Hawkes-based microstructure of rough volatility model with sharp rise
Abstract
Please join us for refreshments outside the lecture room from 15:30.
16:00
Path-dependent optimal transport and applications
Abstract
We extend stochastic optimal transport to path-dependent settings. The problem is to find a semimartingale measure that satisfies general path-dependent constraints while minimising a cost function on the drift and diffusion coefficients. Duality is established and expressed via non-linear path-dependent partial differential equations (PPDEs). The technique has applications in volatility calibration, including the calibration of path-dependent derivatives, local-stochastic volatility (LSV) models, and joint SPX-VIX models. It produces a non-parametric volatility model that localises to the features of the calibrated derivatives. Another application is the robust pricing and hedging of American options in continuous time. This is achieved by establishing duality in a space enlarged by the stopping decisions, and by showing that the extremal points of martingale measures on the enlarged space are in fact martingale measures on the original space coupled with stopping times.
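In schematic form (the symbols $F$, $G$ and $c$ below are chosen here for illustration and are not taken from the talk), the primal problem reads
$$
\inf_{\mathbb{P}}\ \mathbb{E}^{\mathbb{P}}\Big[\int_0^T F\big(t, X_{\cdot\wedge t}, \alpha_t, \beta_t\big)\,\mathrm{d}t\Big]
\quad\text{subject to}\quad
\mathrm{d}X_t = \alpha_t\,\mathrm{d}t + \beta_t^{1/2}\,\mathrm{d}W_t,\qquad
\mathbb{E}^{\mathbb{P}}\big[G(X_{\cdot})\big] = c,
$$
where the infimum runs over semimartingale measures $\mathbb{P}$ with drift $\alpha$ and diffusion $\beta$, the cost $F$ acts on those characteristics, and the constraint $\mathbb{E}^{\mathbb{P}}[G(X_{\cdot})]=c$ encodes path-dependent information such as market prices of path-dependent derivatives. The dual variables attached to such constraints are then characterised by a non-linear PPDE.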
Please join us for refreshments outside the lecture room from 15:30.
16:00
Robust Duality for multi-action options with information delay
Abstract
We show the super-hedging duality for multi-action options, which generalise American options to a larger (possibly uncountable) action space than {stop, continue}. We work in the framework of the Bouchard & Nutz model, relying on an analytic measurable selection theorem. Finally, we consider an information delay on the action component of the product space. The information delay appears in the dual formulation as a possibility to look into the future. This is joint work with Ivan Guo, Shidan Liu and Zhou Zhou.
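Schematically (with notation introduced here for illustration, not taken from the talk), writing $\Phi(a)$ for the payoff under a buyer action $a\in\mathcal{A}$, the duality takes the form
$$
\pi(\Phi)\;=\;\sup_{\mathbb{Q}\in\mathcal{M}}\ \sup_{a\in\mathcal{A}}\ \mathbb{E}^{\mathbb{Q}}\big[\Phi(a)\big],
$$
where $\pi(\Phi)$ is the minimal initial capital from which the seller can super-hedge against every admissible action, $\mathcal{M}$ is a set of martingale measures, and $\mathcal{A}$ is the set of admissible action processes; taking the action space $\{\text{stop},\text{continue}\}$ recovers stopping times and the American case. Under information delay, the actions appearing in the dual formulation are allowed to anticipate the future by the length of the delay.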
Please join us for refreshments outside the lecture room from 15:30.
16:00
Reinforcement Learning in near-continuous time for continuous state-action spaces
Abstract
We consider the reinforcement learning problem of controlling an unknown dynamical system to maximise the long-term average reward along a single trajectory. Most of the literature considers system interactions that occur in discrete time and in discrete state-action spaces. Although this standpoint is suitable for games, it is often inadequate for systems in which interactions occur at high frequency, if not in continuous time, or whose state spaces are large, if not inherently continuous. Perhaps the only exception is the linear-quadratic framework, for which results exist both in discrete and in continuous time. However, its ability to handle continuous states comes with the drawback of a rigid dynamics and reward structure.
This work aims to overcome these shortcomings by modelling interaction times with a Poisson clock of frequency $\varepsilon^{-1}$ which captures arbitrary time scales from discrete ($\varepsilon=1$) to continuous time ($\varepsilon\downarrow0$). In addition, we consider a generic reward function and model the state dynamics according to a jump process with an arbitrary transition kernel on $\mathbb{R}^d$. We show that the celebrated optimism protocol applies when the sub-tasks (learning and planning) can be performed effectively. We tackle learning by extending the eluder dimension framework and propose an approximate planning method based on a diffusive limit ($\varepsilon\downarrow0$) approximation of the jump process.
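As a rough illustration of the interaction model (a minimal simulation sketch; the dynamics, reward, and policy below are placeholders of our own choosing, not the paper's), interactions arrive on a Poisson clock of rate $\varepsilon^{-1}$, so $\varepsilon=1$ gives unit-mean waiting times while $\varepsilon\downarrow0$ drives the interaction frequency to infinity:

import numpy as np

def avg_reward(eps, horizon, policy, rng=None):
    """Estimate the long-term average reward along a single trajectory when
    interactions arrive on a Poisson clock of rate 1/eps (eps = 1: discrete-like;
    eps -> 0: near-continuous time). The transition kernel and reward here are
    illustrative placeholders; the framework allows an arbitrary kernel on R^d
    and a generic reward function."""
    rng = rng or np.random.default_rng(0)
    t, x, total = 0.0, np.zeros(2), 0.0
    while t < horizon:
        dt = rng.exponential(eps)      # inter-arrival time of the clock, mean eps
        t += dt
        a = policy(x)                  # continuous action chosen at the interaction time
        total += dt * reward(x, a)     # reward accrued over the elapsed interval
        # Placeholder jump: increments scaled so that a diffusive limit
        # emerges as eps -> 0, mirroring the approximate-planning regime.
        x = x + eps * (a - x) + np.sqrt(eps) * rng.normal(size=x.shape)
    return total / horizon

def reward(x, a):
    # Generic placeholder running reward.
    return -float(x @ x) - 0.1 * float(a @ a)

# Example: a simple stabilising policy evaluated at two time scales.
print(avg_reward(eps=1.0, horizon=1_000.0, policy=lambda x: -x))   # discrete-like
print(avg_reward(eps=0.01, horizon=1_000.0, policy=lambda x: -x))  # near-continuous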
Overall, our algorithm enjoys a regret of order $\tilde{\mathcal{O}}(\sqrt{T})$, or $\tilde{\mathcal{O}}(\varepsilon^{1/2} T+\sqrt{T})$ when the approximate planning is used. As the frequency of interactions blows up, the approximation error $\varepsilon^{1/2} T$ vanishes, showing that a regret of $\tilde{\mathcal{O}}(\sqrt{T})$ is attainable in near-continuous time.
Please join us for refreshments outside the lecture room from 15:30.