Mathematical Institute

Fri, 19 Jun 2026

16:00 - 17:00

Lecture Room 3

Maths & Stats Colloquium

Prof Andrew Saxe

(UCL Gatsby Computational Neuroscience Unit)

Abstract

Professor Andrew Saxe will talk about 'Demystifying depth: principles of learning in deep neural networks'.

Deep neural networks have revolutionized artificial intelligence, yet their inner workings remain poorly understood. This talk presents mathematical analyses of the nonlinear dynamics of learning in several solvable deep network models, offering theoretical insights into the role of depth. These models reveal how learning algorithms, data structure, initialization schemes, and architectural choices interact to produce hidden representations that afford complex generalization behaviours. A recurring theme across these analyses is a neural race: competing pathways within a deep network vie to explain the data, with an implicit bias toward shared representations. These shared representations in turn shape the network’s capacity for systematic generalization, multitasking, and transfer learning. I will show how such principles manifest across diverse architectures—including feedforward and linear attention networks. Together, these results provide analytic foundations for understanding how environmental statistics, network architecture, and learning dynamics jointly structure the emergence of neural representations and behaviour.

Further Information

Bio:
Andrew Saxe is a Professor of Theoretical Neuroscience and Machine Learning at the Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre at UCL, and a Visiting Professor at Wits University. His research seeks to unravel the computational principles governing learning in artificial and biological systems. To do so, his work draws on a range of applied mathematics in order to understand modern ‘deep’ artificial neural networks and develop theories for experimental domains in neuroscience and psychology. His work has been recognized by the Robert J. Glushko Dissertation Prize from the Cognitive Science Society, a Schmidt Science Polymath award, and the Blavatnik UK Finalist Award in Life Sciences. He is a CIFAR Fellow in the Learning in Machines & Brains program.



