Date
Tue, 27 Feb 2024
11:00
Location
L5
Speaker
Harrison Waldon
Organisation
Oxford Man Institute

This paper presents the (Adaptive) Iterative Linear Quadratic Regulator Deep Galerkin Method ((A)ILQR-DGM), a novel approach for solving optimal control (OC) problems in dynamic and uncertain environments. Traditional OC methods face challenges in scalability and adaptability due to the curse of dimensionality and their reliance on accurate models. Model Predictive Control (MPC) addresses these issues but is limited to open-loop controls. With (A)ILQR-DGM, we combine deep learning with OC to compute closed-loop control policies that adapt to changing dynamics. Our methodology is split into two phases: offline and online. In the offline phase, ILQR-DGM computes a globally optimal control by minimizing a variational formulation of the Hamilton-Jacobi-Bellman (HJB) equation. To improve performance over the DGM (Sirignano & Spiliopoulos, 2018), ILQR-DGM uses the ILQR method (Todorov & Li, 2005) to initialize the value function and policy networks. In the online phase, AILQR-DGM solves continuously updated OC problems based on noisy observations of the environment. We provide results based on HJB stability theory to show that AILQR-DGM leverages Transfer Learning (TL) to adapt the optimal policy. We test (A)ILQR-DGM in various setups and demonstrate its superior performance over traditional methods, especially in scenarios with misspecified priors and changing dynamics.
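The abstract does not give the paper's formulation or code, but the DGM idea it builds on can be illustrated in miniature: train a value network V(t, x) so that the residual of an HJB equation vanishes at randomly sampled points, plus a penalty enforcing the terminal condition. The sketch below is a hypothetical illustration under assumed linear-quadratic dynamics and cost (A = B = I, unit state and control penalties, terminal cost |x|^2); none of these choices, names, or network sizes come from the paper.

```python
# Minimal DGM-style sketch (illustrative assumptions, not the paper's method):
# solve V_t + min_a { (x + a) . grad_x V + |x|^2 + |a|^2 } = 0,  V(T, x) = |x|^2.
import torch

T, dim = 1.0, 2

value_net = torch.nn.Sequential(
    torch.nn.Linear(1 + dim, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def hjb_residual(t, x):
    """PDE residual for assumed dynamics dx = (x + a) dt and cost |x|^2 + |a|^2."""
    v = value_net(torch.cat([t, x], dim=1))
    v_t, v_x = torch.autograd.grad(v.sum(), (t, x), create_graph=True)
    # For this quadratic control cost the minimizer is a* = -grad_x V / 2,
    # giving a closed-form Hamiltonian.
    a_star = -0.5 * v_x
    ham = ((x + a_star) * v_x).sum(1, keepdim=True) \
        + (x ** 2).sum(1, keepdim=True) + (a_star ** 2).sum(1, keepdim=True)
    return v_t + ham

opt = torch.optim.Adam(value_net.parameters(), lr=1e-3)
for step in range(2000):
    t = T * torch.rand(256, 1, requires_grad=True)
    x = torch.randn(256, dim, requires_grad=True)
    interior = hjb_residual(t, x).pow(2).mean()          # PDE residual loss
    tT = torch.full((256, 1), T)                         # terminal-time samples
    terminal = (value_net(torch.cat([tT, x], dim=1))
                - (x ** 2).sum(1, keepdim=True)).pow(2).mean()
    opt.zero_grad()
    (interior + terminal).backward()
    opt.step()
```

In the paper's scheme, rather than starting this training from random weights, the ILQR solution would supply an initial value function and policy, and the online (adaptive) phase would warm-start retraining from the current networks as the estimated dynamics change.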
