Date
Wed, 14 Jan 2026
Time
14:00 - 15:00
Location
Lecture Room 3
Speaker
Andrew Gordon Wilson

Deep neural networks are often seen as different from other model classes, defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign overfitting, double descent, and the success of overparametrization. We argue that these phenomena are neither unique to neural networks nor particularly mysterious. Moreover, this generalization behaviour can be intuitively understood and rigorously characterized using long-standing generalization frameworks such as PAC-Bayes and countable hypothesis bounds. We present soft inductive biases as a key unifying principle in explaining these phenomena: rather than restricting the hypothesis space to avoid overfitting, embrace a flexible hypothesis space with a soft preference for simpler solutions that are consistent with the data. This principle can be encoded in many model classes, and thus deep learning is not as mysterious or different from other model classes as it might seem. However, we also highlight ways in which deep learning is relatively distinct, such as its capacity for representation learning, phenomena such as mode connectivity, and its relative universality.
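
As a rough illustration of the soft-inductive-bias idea (not material from the talk), the sketch below contrasts a hard constraint (restricting a polynomial fit to low degree) with a soft preference: fitting a flexible, high-degree polynomial while adding a small ridge penalty that favours simpler, small-coefficient solutions consistent with the data. The specific model, penalty strength, and data are illustrative assumptions, not the speaker's examples.

import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying function.
x_train = np.linspace(-1, 1, 15)
y_train = np.sin(np.pi * x_train) + 0.1 * rng.standard_normal(x_train.size)
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(np.pi * x_test)

def design(x, degree):
    # Polynomial features [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

def fit(x, y, degree, lam):
    # Least squares with an optional L2 penalty lam * ||w||^2,
    # i.e. a soft preference for small ("simple") coefficient vectors.
    Phi = design(x, degree)
    if lam == 0.0:
        w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        return w
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

degree = 14  # flexible hypothesis space: as many parameters as data points
for lam in (0.0, 1e-3):
    w = fit(x_train, y_train, degree, lam)
    mse = np.mean((design(x_test, degree) @ w - y_test) ** 2)
    print(f"degree={degree}  lambda={lam:g}  test MSE={mse:.4f}")

# The unregularized fit interpolates the noise; the ridge fit, which softly
# prefers simpler solutions, typically generalizes far better even though the
# hypothesis space itself has not been restricted at all.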


Bio: Andrew Gordon Wilson is a Professor at the Courant Institute of Mathematical Sciences and Center for Data Science at New York
University. He is interested in developing a prescriptive foundation for building intelligent systems. His work includes loss landscapes,
optimization, Bayesian model selection, equivariances, generalization theory, and scientific applications. His website is
https://cims.nyu.edu/~andrewgw.
