Smooth, globally Polyak-Łojasiewicz functions are nonlinear least-squares
Abstract
Associate Professor Nicolas Boumal will talk about: 'Smooth, globally Polyak-Łojasiewicz functions are nonlinear least-squares'
Polyak-Łojasiewicz (PŁ) functions abound in the literature, especially in nonconvex optimization. When they are also smooth, they become surprisingly simple, with an exotic twist. The plan is for us to discover the structure of those functions and of their sets of minimizers via gradient flow and fiber bundles.
Joint work with Christopher Criscitiello and Quentin Rebjock.
Maths & Stats Colloquium
Abstract
Professor Andrew Saxe will talk about: 'Demystifying depth: principles of learning in deep neural networks'
Deep neural networks have revolutionized artificial intelligence, yet their inner workings remain poorly understood. This talk presents mathematical analyses of the nonlinear dynamics of learning in several solvable deep network models, offering theoretical insights into the role of depth. These models reveal how learning algorithms, data structure, initialization schemes, and architectural choices interact to produce hidden representations that afford complex generalization behaviours. A recurring theme across these analyses is a neural race: competing pathways within a deep network vie to explain the data, with an implicit bias toward shared representations. These shared representations in turn shape the network’s capacity for systematic generalization, multitasking, and transfer learning. I will show how such principles manifest across diverse architectures—including feedforward and linear attention networks. Together, these results provide analytic foundations for understanding how environmental statistics, network architecture, and learning dynamics jointly structure the emergence of neural representations and behaviour.
Bio:
Andrew Saxe is a Professor of Theoretical Neuroscience and Machine Learning at the Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre at UCL, and a Visiting Professor at Wits University. His research seeks to unravel the computational principles governing learning in artificial and biological systems. To do so, his work draws on a range of applied mathematics in order to understand modern ‘deep’ artificial neural networks and develop theories for experimental domains in neuroscience and psychology. His work has been recognized by the Robert J. Glushko Dissertation Prize from the Cognitive Science Society, a Schmidt Science Polymath award, and the Blavatnik UK Finalist Award in Life Sciences. He is a CIFAR Fellow in the Learning in Machines & Brains program.