Maths & Stats Colloquium
Abstract:
Professor Andrew Saxe will talk about 'Demystifying depth: principles of learning in deep neural networks'.
Deep neural networks have revolutionized artificial intelligence, yet their inner workings remain poorly understood. This talk presents mathematical analyses of the nonlinear dynamics of learning in several solvable deep network models, offering theoretical insights into the role of depth. These models reveal how learning algorithms, data structure, initialization schemes, and architectural choices interact to produce hidden representations that afford complex generalization behaviours. A recurring theme across these analyses is a neural race: competing pathways within a deep network vie to explain the data, with an implicit bias toward shared representations. These shared representations in turn shape the network’s capacity for systematic generalization, multitasking, and transfer learning. I will show how such principles manifest across diverse architectures, including feedforward and linear attention networks. Together, these results provide analytic foundations for understanding how environmental statistics, network architecture, and learning dynamics jointly structure the emergence of neural representations and behaviour.
Bio:
Andrew Saxe is a Professor of Theoretical Neuroscience and Machine Learning at the Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre at UCL, and a Visiting Professor at Wits University. His research seeks to unravel the computational principles governing learning in artificial and biological systems. To do so, his work draws on a range of applied mathematics in order to understand modern ‘deep’ artificial neural networks and develop theories for experimental domains in neuroscience and psychology. His work has been recognized by the Robert J. Glushko Dissertation Prize from the Cognitive Science Society, a Schmidt Science Polymath award, and the Blavatnik UK Finalist Award in Life Sciences. He is a CIFAR Fellow in the Learning in Machines & Brains program.