Seminar series
Date: Fri, 01 Mar 2024, 16:00
Location: L1
Speaker: Professor Rebecca Willett (University of Chicago)

Neural network architectures play a key role in determining which functions are fit to training data and the resulting generalization properties of learned predictors. For instance, imagine training an overparameterized neural network to interpolate a set of training samples using weight decay; the network architecture will influence which interpolating function is learned. 

In this talk, I will describe new insights into the role of network depth in machine learning using the notion of representation costs – i.e., how much it “costs” for a neural network to represent some function f. First, we will see that there is a family of functions that can be learned with depth-3 networks when the number of samples is polynomial in the input dimension d, but which cannot be learned with depth-2 networks unless the number of samples is exponential in d. Furthermore, no functions can easily be learned with depth-2 networks while being difficult to learn with depth-3 networks. Together, these results mean deeper networks have an unambiguous advantage over shallower networks in terms of sample complexity.

Second, I will show that adding linear layers to a ReLU network yields a representation cost that favors functions with latent low-dimensional structure, such as single- and multi-index models. Overall, these results highlight the role of network depth from a function space perspective and yield new tools for understanding neural network generalization.
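
For readers unfamiliar with the term, the representation cost mentioned in the abstract is commonly formalized along the following lines. This is a sketch under assumed notation, not necessarily the exact definition used in the talk: the symbols R_L, h_θ^{(L)}, θ, and (x_i, y_i) do not appear in the abstract, and normalization conventions vary between papers.

```latex
% Assumed notation (not from the abstract):
%   h_\theta^{(L)} : a depth-L network with weights \theta
%   (x_i, y_i), i = 1, ..., n : the training samples
% Representation cost of f: the smallest squared weight norm among all
% depth-L parameterizations that realize f exactly.
\[
  R_L(f) \;=\; \inf_{\theta}
  \left\{ \|\theta\|_2^2 \;:\; h_\theta^{(L)}(x) = f(x) \ \text{for all } x \right\}
\]
% Minimum-norm interpolation in parameter space (the regime that weight decay
% favors) attains the same value as minimum-cost interpolation in function space:
\[
  \inf_{\theta} \left\{ \|\theta\|_2^2 \;:\; h_\theta^{(L)}(x_i) = y_i \ \ \forall i \right\}
  \;=\;
  \inf_{f} \left\{ R_L(f) \;:\; f(x_i) = y_i \ \ \forall i \right\}
\]
```

Under this reading, asking which interpolating function a given architecture learns amounts to asking which functions have a small R_L, and how that changes with the depth L.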

Further Information

Rebecca Willett is a Professor of Statistics and Computer Science and the Faculty Director of AI at the Data Science Institute at the University of Chicago, with a courtesy appointment at the Toyota Technological Institute at Chicago. Her research is focused on machine learning foundations, scientific machine learning, and signal processing. She is the Deputy Director for Research at the NSF-Simons Foundation National Institute for Theory and Mathematics in Biology and a member of the Executive Committee for the NSF Institute for the Foundations of Data Science. She is the Faculty Director of the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship and helps direct the Air Force Research Lab University Center of Excellence on Machine Learning.

Last updated on 28 Feb 2024 17:07.