Date: Thu, 12 Mar 2020
Time: 16:00 - 17:00
Location: L4
Speaker: Yufei Zhang

In this talk, we propose a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control performs robustly in a perturbed system, and derive a first-order sensitivity equation for both the value function and the optimal feedback relaxed control. These stability results give a theoretical justification for the recent reinforcement learning heuristic that including an exploration reward in the optimization objective leads to more robust decision making. We finally prove first-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters, which enables us to construct the pure exploitation strategy of the original control problem from the feedback relaxed controls. This is joint work with Christoph Reisinger (available at https://arxiv.org/abs/2001.03148).
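To fix ideas, the following is a minimal sketch of a regularized relaxed control objective with an entropy-type exploration reward; the talk treats a more general class of exploration rewards, and the symbols below (state X, relaxed control nu, exit time tau, exploration weight lambda) are illustrative rather than the paper's exact formulation:

\[
  V^{\lambda}(x) \;=\; \sup_{\nu}\; \mathbb{E}_x\!\left[\int_0^{\tau} \left( \int_A f(X_t,a)\,\nu_t(\mathrm{d}a) \;+\; \lambda\,\mathcal{E}(\nu_t) \right) \mathrm{d}t \;+\; g(X_\tau)\right],
  \qquad
  \mathcal{E}(\nu) \;=\; -\int_A \ln\!\frac{\mathrm{d}\nu}{\mathrm{d}a}(a)\,\nu(\mathrm{d}a),
\]

where the relaxed control \(\nu_t\) is a probability measure over the action set \(A\), \(\tau\) is the exit time of the state process from the domain, and the differential entropy \(\mathcal{E}\) is one common choice of exploration reward. Sending the exploration weight \(\lambda\) to zero recovers the original pure-exploitation problem, in line with the monotone convergence result described above.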