Date: Thu, 09 May 2019
Time: 13:00 - 14:00
Location: L4
Speaker: Theerawat Bhudisaksang & Yufei Zhang (DPhil students)

Theerawat Bhudisaksang
----------------------

Adaptive robust control with statistical learning

We extend the adaptive robust methodology introduced in Bielecki et al. and propose a continuous-time version of their approach. Bielecki et al. consider a model in which the distribution of the underlying (observable) process depends on unknown parameters, and the agent uses observations of the process to estimate the parameter values. The model is made robust to misspecification because the agent employs a set of ambiguity measures containing those measures under which the parameters lie inside a confidence region of their estimator. In our extension, we construct the set of ambiguity measures such that each probability measure in the set has a semimartingale characterisation that lies in a restricted set. Finally, we prove the dynamic programming principle for the adaptive robust control problem in continuous time using measurable selection theorems, and we show that the value function can be characterised as the solution of a non-linear partial differential equation.

Yufei Zhang
-----------

A neural network based policy iteration algorithm with global convergence of values and controls for stochastic games on domains

In this talk, we propose a class of neural network based numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We exploit policy iteration to reduce the semilinear problem to a sequence of linear Dirichlet problems, which are subsequently approximated by a multilayer feedforward neural network ansatz. We establish that the numerical solutions converge globally in the H^2-norm, and further demonstrate that this convergence is superlinear, by interpreting the algorithm as an inexact Newton iteration for the HJBI equation. Moreover, we construct the optimal feedback controls from the numerical value functions and deduce their convergence. The numerical schemes and convergence results are then extended to HJBI boundary value problems corresponding to controlled diffusion processes with oblique boundary reflection. Numerical experiments on the stochastic Zermelo navigation problem are presented to illustrate the theoretical results and to demonstrate the effectiveness of the method.
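
As a rough illustration of the policy iteration step only, the sketch below treats a one-dimensional toy exit-time game on (0, 1) that is not taken from the talk. It is not the speakers' scheme: each linear Dirichlet problem is solved with central finite differences rather than a neural network ansatz. The function name policy_iteration, the control bounds a_max and b_max, the unit running cost and the grid size are all assumptions made for this example. The toy equation is (sigma^2/2) u'' + min_a max_b (a + b) u' + 1 = 0 with u(0) = u(1) = 0, where the minimiser picks |a| <= a_max and the maximiser picks |b| <= b_max.

```python
import numpy as np


def policy_iteration(n=101, sigma=1.0, a_max=1.0, b_max=0.5,
                     tol=1e-10, max_iter=50):
    """Policy iteration for a hypothetical 1D exit-time game on (0, 1).

    Assumed toy problem (not from the talk):
        (sigma^2 / 2) u'' + min_a max_b (a + b) u' + 1 = 0,  u(0) = u(1) = 0.
    Finite differences stand in for the neural network ansatz.
    """
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    u = np.zeros(n)
    interior = np.arange(1, n - 1)
    diff = 0.5 * sigma**2 / h**2

    for k in range(max_iter):
        # Policy update: pointwise optimal feedback controls for the
        # current value estimate, using a central-difference gradient.
        du = np.zeros(n)
        du[1:-1] = (u[2:] - u[:-2]) / (2.0 * h)
        a = -a_max * np.sign(du)   # minimiser pushes against the gradient
        b = b_max * np.sign(du)    # maximiser pushes along the gradient
        drift = a + b

        # Policy evaluation: with the controls frozen, the equation is a
        # linear Dirichlet problem, solved here as a dense linear system.
        A = np.zeros((n, n))
        rhs = np.zeros(n)
        A[0, 0] = 1.0              # boundary condition u(0) = 0
        A[-1, -1] = 1.0            # boundary condition u(1) = 0
        A[interior, interior - 1] = diff - drift[interior] / (2.0 * h)
        A[interior, interior] = -2.0 * diff
        A[interior, interior + 1] = diff + drift[interior] / (2.0 * h)
        rhs[interior] = -1.0       # unit running cost moved to the right side
        u_new = np.linalg.solve(A, rhs)

        change = np.max(np.abs(u_new - u))
        u = u_new
        if change < tol:
            break

    return x, u, k + 1
```

Calling x, u, iters = policy_iteration() returns the grid, the approximate value function and the number of iterations used. For this particular toy problem with the default parameters, the equation reduces to (1/2) u'' - (1/2)|u'| + 1 = 0, which can be solved by hand and gives u(1/2) = 2 exp(-1/2) - 1, so the numerical result can be checked against that value.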
 

Last updated on 03 Apr 2022 01:32.