A coordinate descent algorithm on the Stiefel manifold for deep neural network training

Seminar series

Computational Mathematics and Applications Seminar

Date

Thu, 11 May 2023

Time

14:00 - 15:00

Location

Lecture Room 3

Speaker

Estelle Massart

Organisation

UC Louvain

We propose to use stochastic Riemannian coordinate descent on the Stiefel manifold for deep neural network training. The algorithm successively rotates pairs of columns of the matrix, an operation that can be implemented efficiently as a multiplication by a Givens matrix. When the coordinate is selected uniformly at random at each iteration, we prove convergence of the proposed algorithm under standard assumptions on the loss function, step size, and minibatch noise. Experiments on benchmark deep neural network training problems are presented to demonstrate the effectiveness of the proposed algorithm.
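To illustrate the column-rotation idea in the abstract, here is a minimal pure-Python sketch. It is not the speaker's implementation: the function names (`rotate_columns`, `gram`, `coordinate_step`), the sign convention of the rotation, and the update rule "rotate by minus the step size times the derivative along the rotation" are all assumptions made for illustration. The point it shows is that right-multiplying by a Givens matrix touches only two columns and keeps the columns orthonormal, so the iterate stays on the Stiefel manifold.

```python
import math
import random


def rotate_columns(X, i, j, theta):
    """Right-multiply X by a Givens rotation acting on columns i and j (in place).

    X is a list of rows. Only columns i and j change, so the cost is O(n)
    for an n-by-p matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    for row in X:
        xi, xj = row[i], row[j]
        row[i] = c * xi + s * xj
        row[j] = -s * xi + c * xj
    return X


def gram(X):
    """Return X^T X; it equals the identity when the columns are orthonormal."""
    n, p = len(X), len(X[0])
    return [[sum(X[k][a] * X[k][b] for k in range(n)) for b in range(p)]
            for a in range(p)]


def coordinate_step(X, G, stepsize, rng):
    """One illustrative coordinate step (assumed update rule, not from the talk).

    G is a (possibly minibatch) Euclidean gradient of the loss at X, with the
    same shape as X. A column pair is drawn uniformly at random, and X is
    rotated by -stepsize times the derivative of the loss along that rotation.
    """
    p = len(X[0])
    i, j = rng.sample(range(p), 2)
    # Derivative of theta -> loss(rotate_columns(X, i, j, theta)) at theta = 0.
    d = sum(G[k][i] * X[k][j] - G[k][j] * X[k][i] for k in range(len(X)))
    rotate_columns(X, i, j, -stepsize * d)
    return i, j
```

A possible usage, with a hypothetical gradient matrix `G`: start from `X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]`, call `coordinate_step(X, G, 0.1, random.Random(0))`, and check that `gram(X)` is still the identity up to rounding error. No retraction or projection is needed after the step, because the Givens rotation is itself orthogonal.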