Date
Mon, 25 Nov 2024
Time
14:00 - 15:00
Location
Lecture Room 3
Speaker
Laura Palagi
Organisation
Sapienza University of Rome

We consider minimizing the sum of a large number of smooth and possibly non-convex functions, which is the typical problem encountered in the training of deep neural networks on large-size datasets. 

Improving the Controlled Minibatch Algorithm (CMA) scheme proposed by Liuzzi et al. (2022), we propose CMALight, an ease-controlled incremental gradient (IG)-like method. The IG iteration is controlled by means of a costless watchdog rule and a derivative-free line search that activates only sporadically to guarantee convergence. The scheme also allows controlling the update of the learning rate used in the main IG iteration, avoiding the use of preset rules and thus overcoming another tricky aspect of implementing online methods.
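To make the control idea concrete, here is a minimal Python sketch of an IG epoch guarded by a watchdog test, with a derivative-free backtracking fallback. It is an illustration under stated assumptions, not the CMALight algorithm itself. The least-squares test problem, the function names, the constants `gamma` and `shrink`, and the exact form of the two acceptance tests are all hypothetical choices made for this example.

```python
def component_loss(w, a, b):
    # Loss of one component: 0.5 * (a.w - b)^2
    r = sum(ai * wi for ai, wi in zip(a, w)) - b
    return 0.5 * r * r

def component_grad(w, a, b):
    # Gradient of one component with respect to w
    r = sum(ai * wi for ai, wi in zip(a, w)) - b
    return [r * ai for ai in a]

def full_loss(w, data):
    # True objective: sum of all component losses at the same point
    return sum(component_loss(w, a, b) for a, b in data)

def ig_epoch(w, data, step):
    # One incremental-gradient pass. The component losses met along the
    # way are summed into an estimate of the objective at no extra cost.
    w = list(w)
    estimate = 0.0
    for a, b in data:
        estimate += component_loss(w, a, b)
        g = component_grad(w, a, b)
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w, estimate

def controlled_ig(w, data, step=0.1, gamma=1e-4, shrink=0.5, epochs=50):
    ref = full_loss(w, data)  # reference value for the watchdog test
    for _ in range(epochs):
        trial, estimate = ig_epoch(w, data, step)
        # Watchdog (assumed form): accept the epoch if the cheap estimate
        # shows sufficient decrease with respect to the reference value.
        if estimate <= ref - gamma * step:
            w, ref = trial, estimate
            continue
        # Watchdog failed: reduce the learning rate and run a
        # derivative-free backtracking search along d = trial - w.
        step *= shrink
        d = [ti - wi for ti, wi in zip(trial, w)]
        d_sq = sum(di * di for di in d)
        f_w = full_loss(w, data)
        alpha = 1.0
        while alpha > 1e-8:
            cand = [wi + alpha * di for wi, di in zip(w, d)]
            f_c = full_loss(cand, data)
            if f_c <= f_w - gamma * alpha ** 2 * d_sq:
                w, f_w = cand, f_c
                break
            alpha *= shrink
        ref = f_w
    return w, step

# Toy consistent least-squares problem with solution [1, -2]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -2.0),
        ([1.0, 1.0], -1.0), ([1.0, -1.0], 3.0)]
w0 = [0.0, 0.0]
w_final, final_step = controlled_ig(w0, data)
```

In this sketch the full objective is evaluated only when the watchdog test fails, and the learning rate is reduced only at those points rather than by a preset schedule.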

Convergence to a stationary point holds under the sole assumption of Lipschitz continuity of the gradients of the component functions, without knowing the Lipschitz constant or imposing any growth assumptions on the norm of the gradients.
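For reference, the problem and the assumption described above can be written as follows. The notation (w for the variables, P for the number of component functions, L for the Lipschitz constant) is chosen here for illustration and may differ from the notation used in the talk.

```latex
% Finite-sum problem: each f_p is smooth and possibly non-convex
\min_{w \in \mathbb{R}^n} \; f(w) = \sum_{p=1}^{P} f_p(w)

% Assumption: the component gradients are Lipschitz continuous
% for some constant L > 0, which does not need to be known
\| \nabla f_p(u) - \nabla f_p(v) \| \le L \, \| u - v \|
\quad \text{for all } u, v \in \mathbb{R}^n, \; p = 1, \dots, P

% A stationary point w^* satisfies
\nabla f(w^*) = 0
```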

We present two sets of computational tests. First, we compare CMALight against state-of-the-art mini-batch algorithms for training standard deep networks on large-size datasets, and deep convolutional neural networks and residual networks on standard image classification tasks on CIFAR10 and CIFAR100. 

Results show that CMALight easily scales up to problems with on the order of millions of variables and has an advantage over its state-of-the-art competitors.

Finally, we present computational results on generative tasks, testing CMALight's scaling capabilities on image generation with diffusion models (U-Net architecture). CMALight achieves better test performance and is more efficient than standard SGD with weight decay, thus reducing the computational burden and the carbon footprint of the training process.

Laura Palagi, @email

Department of Computer, Control and Management Engineering,

Sapienza University of Rome, Italy

Joint work with 

Corrado Coppola, @email

Giampaolo Liuzzi, @email

Lorenzo Ciarpaglini, @email

Last updated on 9 Sep 2024, 8:38am. Please contact us with feedback and comments about this page.