Date
Tue, 21 Nov 2023
Time
16:00 - 17:00
Location
L6
Speaker
Thiziri Nait Saada
Organisation
Mathematical Institute (University of Oxford)

The infinitely wide neural network has proven to be a useful and tractable mathematical model for understanding many phenomena that appear in deep learning. One example is the convergence of random deep networks to Gaussian processes, which enables a rigorous analysis of how the choice of activation function and network weights affects the training dynamics. In this paper, we extend the seminal proof of Matthews et al. (2018) to a larger class of initial weight distributions (which we call "pseudo-i.i.d."), including the established cases of i.i.d. and orthogonal weights, as well as the emerging low-rank and structured sparse settings celebrated for their computational speed-up benefits. We show that fully-connected and convolutional networks initialized with pseudo-i.i.d. distributions are all effectively equivalent up to their variance. Using our results, one can identify the Edge-of-Chaos for a broader class of neural networks and tune them at criticality in order to enhance their training.
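The sketch below is not from the talk or the paper; it is a minimal Python/NumPy illustration of two ideas mentioned in the abstract, under standard assumptions: (1) pre-activations of a wide random layer look Gaussian for both i.i.d. and orthogonal weight initializations, and (2) the Edge-of-Chaos condition chi = sigma_w^2 * E[phi'(z)^2] = 1 (here with phi = tanh) that is used to tune networks at criticality. The symbols sigma_w, sigma_b and the helper function names are illustrative assumptions, not the authors' code.

import numpy as np

rng = np.random.default_rng(0)

def iid_weights(n_out, n_in, sigma_w):
    # i.i.d. Gaussian weights with variance sigma_w^2 / n_in
    return rng.normal(0.0, sigma_w / np.sqrt(n_in), size=(n_out, n_in))

def orthogonal_weights(n, sigma_w):
    # random orthogonal weights scaled so that W @ W.T = sigma_w^2 * I
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return sigma_w * q

def preactivations(W, sigma_b, x):
    # one layer: pre-activation h = W phi(x) + b, with phi = tanh
    return W @ np.tanh(x) + rng.normal(0.0, sigma_b, size=W.shape[0])

# (1) Wide-layer pre-activations are close to Gaussian for both weight classes:
# the excess kurtosis of the empirical distribution should be close to 0.
n = 4000
x = rng.normal(size=n)
for name, W in [("iid", iid_weights(n, n, 1.5)),
                ("orthogonal", orthogonal_weights(n, 1.5))]:
    h = preactivations(W, 0.1, x)
    excess_kurtosis = ((h - h.mean())**4).mean() / h.var()**2 - 3.0
    print(f"{name:>10}: mean={h.mean():+.3f}, excess kurtosis={excess_kurtosis:+.3f}")

# (2) Edge-of-Chaos for tanh: iterate the variance map
#     q <- sigma_w^2 * E[tanh(sqrt(q) z)^2] + sigma_b^2,  z ~ N(0, 1),
# to its fixed point q*, then compute chi = sigma_w^2 * E[tanh'(sqrt(q*) z)^2].
def chi(sigma_w, sigma_b, n_mc=200_000):
    z = rng.normal(size=n_mc)
    q = 1.0
    for _ in range(100):  # fixed-point iteration of the variance map
        q = sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z)**2) + sigma_b**2
    dphi = 1.0 - np.tanh(np.sqrt(q) * z)**2  # derivative of tanh
    return sigma_w**2 * np.mean(dphi**2)

for sigma_w in (0.9, 1.1, 1.5):  # sigma_b fixed at 0.1 for illustration
    print(f"sigma_w={sigma_w:.2f}: chi={chi(sigma_w, 0.1):.3f}")
# chi < 1: ordered phase; chi > 1: chaotic phase; chi close to 1: criticality
# (the Edge-of-Chaos), where initialization is tuned to aid training.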
