Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems
Abstract
We study the global linear convergence of policy gradient (PG) methods for finite-horizon exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. In contrast to discrete-time problems, the cost is noncoercive in the policy and not all descent directions lead to bounded iterates. We propose geometry-aware gradient descents for the mean and covariance of the policy using the Fisher geometry and the Bures-Wasserstein geometry, respectively. The policy iterates are shown to obey an a priori bound and to converge globally to the optimal policy at a linear rate. We further propose a novel PG method with discrete-time policies. The algorithm leverages the continuous-time analysis and achieves robust linear convergence across different action frequencies. A numerical experiment confirms the convergence and robustness of the proposed algorithm.
This is joint work with Yufei Zhang and Christoph Reisinger.
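As an illustrative sketch of the policy class described in the abstract, a Gaussian policy with state-linear mean and state-independent covariance could be written as below. The symbols $K_t$, $V_t$ and $T$ are notation assumed here for illustration and do not appear in the abstract.

```latex
\pi(\mathrm{d}a \mid t, x) = \mathcal{N}\bigl(K_t\, x,\; V_t\bigr)(\mathrm{d}a),
\qquad t \in [0, T].
```

In this notation, $K_t$ would be the parameter of the mean updated using the Fisher geometry, and $V_t$ the covariance updated using the Bures-Wasserstein geometry.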
14:00
Compactification of 6d N=(1,0) quivers, 4d SCFTs and their holographic dual Massive IIA backgrounds
Abstract
We study an infinite family of Massive Type IIA backgrounds that holographically describe the twisted compactification of N=(1,0) six-dimensional SCFTs to four dimensions. The analysis of the branes involved suggests a four-dimensional linear quiver QFT that deconstructs the theory in six dimensions. For the case in which the system reaches a strongly coupled fixed point, we calculate some observables and compare them with holographic results. We also study two quantities that measure the number of degrees of freedom for the flow across dimensions.