Date
Tue, 27 Feb 2024
Time
11:00
Location
L5
Speaker
Harrison Waldon
Organisation
Oxford Man Institute

This paper presents the (Adaptive) Iterative Linear Quadratic Regulator Deep Galerkin Method ((A)ILQR-DGM), a novel approach for solving optimal control (OC) problems in dynamic and uncertain environments. Traditional OC methods face challenges in scalability and adaptability due to the curse of dimensionality and their reliance on accurate models. Model Predictive Control (MPC) addresses these issues but is limited to open-loop controls. With (A)ILQR-DGM, we combine deep learning with OC to compute closed-loop control policies that adapt to changing dynamics. Our methodology is split into two phases: offline and online. In the offline phase, ILQR-DGM computes a globally optimal control by minimizing a variational formulation of the Hamilton-Jacobi-Bellman (HJB) equation. To improve performance over DGM (Sirignano & Spiliopoulos, 2018), ILQR-DGM uses the ILQR method (Todorov & Li, 2005) to initialize the value function and policy networks. In the online phase, AILQR-DGM solves continuously updated OC problems based on noisy observations of the environment. We provide results based on HJB stability theory to show that AILQR-DGM leverages Transfer Learning (TL) to adapt the optimal policy. We test (A)ILQR-DGM in various setups and demonstrate its superior performance over traditional methods, especially in scenarios with misspecified priors and changing dynamics.
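As a rough illustration of the offline phase (a sketch, not the speaker's implementation), the code below shows a DGM-style loss that penalizes the HJB residual at sampled space-time points for a one-dimensional stochastic LQR problem. PyTorch, the network architecture, and the constants A, B, Q, R, sigma, and T are all assumptions made for this example; in the method described above, ILQR would additionally supply the initialization of the value and policy networks before a loss of this kind is minimized.

# Illustrative sketch only: a DGM-style residual loss for the HJB equation
# of a 1-D stochastic LQR problem. Constants and network sizes are assumed.
import torch
import torch.nn as nn

A, B, Q, R, sigma, T = -1.0, 1.0, 1.0, 0.1, 0.5, 1.0  # assumed model constants

# Value function approximator V(t, x).
value_net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

def hjb_residual_loss(n_samples: int = 256) -> torch.Tensor:
    # Sample interior collocation points (t, x).
    t = (torch.rand(n_samples, 1) * T).requires_grad_(True)
    x = (torch.randn(n_samples, 1) * 2.0).requires_grad_(True)
    V = value_net(torch.cat([t, x], dim=1))

    # Derivatives of V via automatic differentiation.
    V_t = torch.autograd.grad(V.sum(), t, create_graph=True)[0]
    V_x = torch.autograd.grad(V.sum(), x, create_graph=True)[0]
    V_xx = torch.autograd.grad(V_x.sum(), x, create_graph=True)[0]

    # For dynamics dx = (A x + B a) dt + sigma dW and running cost
    # Q x^2 + R a^2, the Hamiltonian is minimized by a* = -B V_x / (2 R);
    # substituting a* back gives the HJB residual at each sample.
    a_star = -B * V_x / (2.0 * R)
    residual = (V_t + (A * x + B * a_star) * V_x
                + Q * x ** 2 + R * a_star ** 2
                + 0.5 * sigma ** 2 * V_xx)

    # Terminal condition V(T, x) = g(x); g(x) = Q x^2 assumed here.
    xT = torch.randn(n_samples, 1)
    V_T = value_net(torch.cat([torch.full_like(xT, T), xT], dim=1))
    terminal = V_T - Q * xT ** 2

    # DGM objective: mean-squared PDE residual plus terminal penalty.
    return residual.pow(2).mean() + terminal.pow(2).mean()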
