Date: Tue, 18 Oct 2022
Time: 15:30 - 16:30
Location: L6
Speaker: Nick Baskerville
Organisation: University of Bristol

Neural networks are the most practically successful class of models in modern machine learning, but there are considerable gaps in the current theoretical understanding of their properties and success. Several authors have applied models and tools from random matrix theory (RMT) to shed light on a variety of aspects of neural network theory; however, the genuine applicability and relevance of these results are in question. Most works rely on modelling assumptions to reduce large, complex matrices (such as the Hessians of neural networks) to something close to a well-understood canonical RMT ensemble, to which all the sophisticated machinery of RMT can then be applied to yield insights and results. There is experimental work, however, that appears to contradict these assumptions. In this talk, we will explore what can be derived about neural networks starting from RMT assumptions that are much more general than those considered in prior work. Our main results start from justifiable assumptions on the local statistics of neural network Hessians and make predictions about their spectra that we can test experimentally on real-world neural networks. Overall, we will argue that familiar ideas from RMT universality are at work in the background, producing practical consequences for modern deep neural networks.
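
As a rough illustration of the kind of local spectral statistics the abstract refers to (this sketch is not from the talk; the network sizes, the PyTorch implementation, and the choice of the consecutive-spacing-ratio statistic are all illustrative assumptions rather than the speaker's method), the snippet below compares the eigenvalue spacing-ratio statistic of a toy MLP Hessian with that of a GOE matrix of the same size:

```python
import math
import torch

torch.manual_seed(0)

# Toy regression problem and a small MLP whose full Hessian fits in memory.
n_in, n_hidden, n_out, n_data = 4, 8, 1, 64
X = torch.randn(n_data, n_in)
y = torch.randn(n_data, n_out)

shapes = [(n_hidden, n_in), (n_hidden,), (n_out, n_hidden), (n_out,)]
n_params = sum(math.prod(s) for s in shapes)  # 49 parameters in total

def loss(theta):
    """MSE loss of the MLP, written as a function of one flat parameter vector."""
    params, i = [], 0
    for s in shapes:
        n = math.prod(s)
        params.append(theta[i:i + n].reshape(s))
        i += n
    W1, b1, W2, b2 = params
    h = torch.tanh(X @ W1.T + b1)
    pred = h @ W2.T + b2
    return ((pred - y) ** 2).mean()

theta = torch.randn(n_params)
H = torch.autograd.functional.hessian(loss, theta)  # shape (n_params, n_params)
hess_evals = torch.linalg.eigvalsh(H)

def mean_spacing_ratio(evals):
    """Mean r-tilde statistic: min(r, 1/r) over ratios r of consecutive spacings."""
    s = torch.diff(torch.sort(evals).values)
    s = s[s > 0]                 # drop exactly degenerate spacings
    r = s[1:] / s[:-1]
    return torch.minimum(r, 1.0 / r).mean().item()

# GOE matrix of the same size for comparison.
G = torch.randn(n_params, n_params)
goe_evals = torch.linalg.eigvalsh((G + G.T) / math.sqrt(2))

print(f"mean r-tilde, toy MLP Hessian: {mean_spacing_ratio(hess_evals):.3f}")
print(f"mean r-tilde, GOE sample:      {mean_spacing_ratio(goe_evals):.3f}")
# Reference values: roughly 0.39 for independent (Poisson) levels, roughly 0.53 for GOE.
```

The spacing-ratio statistic is used here (rather than the raw spacing distribution) because it requires no unfolding of the spectrum, which makes such local-statistics comparisons straightforward to run on matrices whose global spectral density is unknown.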

