Topological Perspectives on Characterizing Generalization in Deep Neural Networks
Dr. Tolga Birdal is an Assistant Professor in the Department of Computing at Imperial College London, with prior experience as a Senior Postdoctoral Research Fellow in Prof. Leonidas Guibas's Geometric Computing Group at Stanford University. Tolga defended his master's and Ph.D. theses at the Computer Vision Group of the Chair for Computer Aided Medical Procedures at the Technical University of Munich, led by Prof. Nassir Navab. He was also a Doktorand at Siemens AG under the supervision of Dr. Slobodan Ilic, working on “Geometric Methods for 3D Reconstruction from Large Point Clouds”. His research interests center on geometric machine learning and 3D computer vision, with a theoretical focus on exploring the boundaries of geometric computing, non-Euclidean inference, and the foundations of deep learning. Dr. Birdal has published extensively in leading academic journals and conference proceedings, including NeurIPS, CVPR, ICLR, ICCV, ECCV, T-PAMI, and IJCV. Aside from his academic life, Tolga has co-founded multiple companies, including BeFunky, a widely used web-based image-editing platform.
Abstract
Training deep learning models involves searching for a good model over the space of possible architectures and their parameters. Discovering models that generalize robustly to unseen data and tasks is of paramount importance for accurate and reliable machine learning. Generalization, a hallmark of model efficacy, is conventionally gauged by a model's performance on data beyond its training set. Yet the reliance on vast training datasets raises a pivotal question: how can deep learning models transcend the notorious hurdle of 'memorization' to generalize effectively? Is it feasible to assess and guarantee the generalization prowess of deep neural networks in advance of empirical testing, and notably, without any recourse to test data? This inquiry is not merely theoretical; it underpins the practical utility of deep learning across myriad applications. In this talk, I will show that scrutinizing the training dynamics of neural networks through the lens of topology, specifically using the 'persistent-homology dimension', leads to novel bounds on the generalization gap and can help demystify the inner workings of neural networks. Our work bridges deep learning with the abstract realms of topology and learning theory, while relating to information theory through compression.
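To make the notion of a persistent-homology dimension more concrete, the sketch below estimates an intrinsic dimension from the scaling of 0-dimensional persistence. It relies on the standard fact that the 0-dimensional persistence lifetimes of a Euclidean point cloud coincide with the edge lengths of its minimum spanning tree, so the α-weighted total MST length E_α(n) of an n-point subsample scales like n^((d−α)/d) for intrinsic dimension d. This is a minimal illustration of the general estimator family, not the exact procedure used in the talk; the function names and the choice of subsample sizes are mine.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def e_alpha(points, alpha=1.0):
    """alpha-weighted total edge length of the Euclidean MST.

    The MST edge lengths coincide with the 0-dimensional persistence
    lifetimes of the Vietoris-Rips filtration of the point cloud."""
    dists = squareform(pdist(points))
    mst = minimum_spanning_tree(dists)
    return (mst.data ** alpha).sum()

def ph_dim_estimate(points, alpha=1.0, sizes=(200, 400, 800, 1600), seed=0):
    """Estimate an intrinsic dimension from the scaling law
    E_alpha(n) ~ n^((d - alpha) / d): fit the log-log slope over
    random subsamples of increasing size n, then solve for d."""
    rng = np.random.default_rng(seed)
    log_n, log_e = [], []
    for n in sizes:
        idx = rng.choice(len(points), size=n, replace=False)
        log_n.append(np.log(n))
        log_e.append(np.log(e_alpha(points[idx], alpha)))
    slope = np.polyfit(log_n, log_e, 1)[0]  # slope = (d - alpha) / d
    return alpha / (1.0 - slope)

# Sanity check on a synthetic point cloud of known intrinsic dimension 2;
# in the generalization setting, `points` would instead be training iterates.
cloud = np.random.default_rng(1).uniform(size=(2000, 2))
estimate = ph_dim_estimate(cloud)  # should land near 2
```

In the talk's setting, a smaller estimated dimension of the optimization trajectory corresponds to a tighter bound on the generalization gap.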
16:00
Some mathematical results on generative diffusion models
Join us for refreshments from 3:30 outside L3.
Abstract
Diffusion models, which transform noise into new data instances by reversing a Markov diffusion process, have become a cornerstone in modern generative modelling. A key component of these models is learning the score function through score matching. While the practical power of diffusion models is now widely recognized, the theoretical developments remain far from mature. Notably, it remains unclear whether gradient-based algorithms can learn the score function with provable accuracy. In this talk, we develop a suite of non-asymptotic results towards understanding the data generation process of diffusion models and the accuracy of score estimation. Our analysis covers both the optimization and the generalization aspects of the learning procedure, and also builds a novel connection to supervised learning and neural tangent kernels.
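The score-matching step can be illustrated in a toy closed-form case. The sketch below (one-dimensional Gaussian data and a linear score model fitted by least squares — these modelling choices are mine, not the speaker's) shows that minimizing the denoising score matching objective recovers the score of the noised marginal: for data x ~ N(0, 1) corrupted to x + σz, the noised marginal is N(0, 1 + σ²), whose score is −y/(1 + σ²).

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200_000, 0.5

# Clean data from a standard Gaussian, and its noised (diffused) copies.
x = rng.standard_normal(n)
x_noisy = x + sigma * rng.standard_normal(n)

# Denoising score matching: regress the target (x - x_noisy) / sigma^2
# onto the noisy samples using a linear score model s(y) = a * y.
target = (x - x_noisy) / sigma**2
a_hat = np.dot(x_noisy, target) / np.dot(x_noisy, x_noisy)  # least squares

# The marginal of x_noisy is N(0, 1 + sigma^2), whose true score is
# -y / (1 + sigma^2), so a_hat should approach -1 / (1 + sigma^2).
a_true = -1.0 / (1.0 + sigma**2)
```

The non-asymptotic theory in the talk addresses the much harder question of how accurately this regression is solved when the score model is a neural network trained by gradient descent.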
This is based on joint work with Yinbin Han and Meisam Razaviyayn (USC).
16:00
Multireference Alignment for Lead-Lag Detection in Multivariate Time Series and Equity Trading
Join us for refreshments from 3:30 outside L3.
Abstract
We introduce a methodology based on Multireference Alignment (MRA) for lead-lag detection in multivariate time series, and demonstrate its applicability in developing trading strategies. Specifically designed for low signal-to-noise ratio (SNR) scenarios, our approach estimates denoised latent signals from a set of time series. We also investigate the impact of clustering the time series on the recovery of latent signals. We demonstrate that our lead-lag detection module outperforms commonly employed cross-correlation-based methods. Furthermore, we devise a cross-sectional trading strategy that capitalizes on the lead-lag relationships uncovered by our approach and attains significant economic benefits. Promising backtesting results on daily equity returns illustrate the potential of our method in quantitative finance and suggest avenues for future research.
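For intuition about the MRA setting, the sketch below implements the simplest synchronize-and-average baseline: align each noisy, cyclically shifted copy of a latent signal to a reference observation by circular cross-correlation, then average. This is only an illustrative baseline under my own synthetic setup — genuine low-SNR MRA methods, including the approach in the talk, rely instead on shift-invariant statistics, since pairwise alignment breaks down as the noise grows.

```python
import numpy as np

def synchronize_and_average(observations):
    """Align noisy, cyclically shifted copies of a latent signal to the
    first observation via circular cross-correlation (computed with the
    FFT), then average the aligned copies.  The latent signal is only
    recoverable up to a global cyclic shift."""
    ref = observations[0]
    aligned = []
    for obs in observations:
        # Circular cross-correlation of the reference with obs.
        corr = np.fft.ifft(np.fft.fft(ref) * np.conj(np.fft.fft(obs))).real
        shift = int(np.argmax(corr))
        aligned.append(np.roll(obs, shift))
    return np.mean(aligned, axis=0)

# Synthetic experiment: random cyclic shifts plus Gaussian noise.
rng = np.random.default_rng(0)
L, m, noise = 64, 500, 0.3
t = np.arange(L)
signal = np.sin(2 * np.pi * t / L) + 0.5 * np.cos(6 * np.pi * t / L)
obs = np.stack([np.roll(signal, rng.integers(L)) + noise * rng.standard_normal(L)
                for _ in range(m)])
estimate = synchronize_and_average(obs)
```

In the lead-lag application, the recovered latent signal plays the role of the denoised common pattern, and the estimated per-series shifts order the time series into leaders and laggers.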
The first of six 'Probability' lectures we are showing, taken from the first year undergraduate course, is now available to watch.
The First Year Probability lectures are for students of Mathematics, Computer Science, and the joint degree courses in Mathematics and Statistics and in Mathematics and Philosophy. First year lectures are supported by lecture notes and complemented by one problem sheet for every two lectures, which students are asked to solve in preparation for discussion in pairs with their tutors in tutorials.
Holography is one of a set of powerful tools which theoretical physicists use to understand the fundamental aspects of nature. The holographic principle states that the entire information content of a theory of quantum gravity in some volume is equivalent (or dual) to a theory living at the boundary of the volume without gravity. The boundary degrees of freedom encode all the bulk degrees of freedom and their dynamics and vice versa.
12:00
Well-posedness of nonlocal aggregation-diffusion equations and systems with irregular kernels
Abstract
Aggregation-diffusion equations and systems have garnered much attention in the last few decades. More recently, models featuring nonlocal interactions through spatial convolution have been applied to several areas, including the physical, chemical, and biological sciences. Typically, one can establish the well-posedness of such models via regularity assumptions on the kernels themselves; however, more effort is required for many scenarios of interest as the nonlocal kernel is often discontinuous.
In this talk, I will present recent progress in establishing a robust well-posedness theory for a class of nonlocal aggregation-diffusion models with minimal regularity requirements on the interaction kernel in any spatial dimension on either the whole space or the torus. Starting with the scalar equation, we first establish the existence of a global weak solution in a small mass regime for merely bounded kernels. Under some additional hypotheses, we show the existence of a global weak solution for any initial mass. In typical cases of interest, these solutions are unique and classical. I will then discuss the generalisation to the $n$-species system for the regimes of small mass and arbitrary mass. We will conclude with some consequences of these theorems for several models typically found in ecological applications.
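To see what such a model looks like numerically, the sketch below time-steps a one-dimensional aggregation-diffusion equation, ∂t ρ = Δρ + ∇·(ρ ∇(W∗ρ)), on the torus, with the nonlocal convolution evaluated spectrally. The kernel is a merely bounded, discontinuous top-hat well — the low-regularity situation the talk is concerned with. The discretization, kernel, and parameters are my own illustrative choices, not taken from the results presented.

```python
import numpy as np

def step_aggregation_diffusion(rho, W_hat, k, dt):
    """One explicit Euler step for the 1-D periodic equation
    d(rho)/dt = Laplacian(rho) + div(rho * grad(W * rho)),
    with the convolution W * rho evaluated via the FFT."""
    rho_hat = np.fft.fft(rho)
    conv = np.fft.ifft(W_hat * rho_hat).real           # W * rho
    v = np.fft.ifft(1j * k * np.fft.fft(conv)).real    # grad(W * rho)
    flux_div = np.fft.ifft(1j * k * np.fft.fft(rho * v)).real
    laplacian = np.fft.ifft(-(k ** 2) * rho_hat).real
    return rho + dt * (laplacian + flux_div)

# Grid on the torus [0, 2*pi) and a merely bounded (discontinuous) kernel:
# an attractive top-hat well, a typical low-regularity example.
N = 128
x = 2 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)                       # integer wavenumbers
W = -np.where(np.minimum(x, 2 * np.pi - x) < 0.5, 1.0, 0.0)
W_hat = np.fft.fft(W) * (2 * np.pi / N)                # quadrature scaling

rho = 1.0 + 0.3 * np.cos(x)                            # positive initial datum
mass0 = rho.mean()
for _ in range(2000):
    rho = step_aggregation_diffusion(rho, W_hat, k, dt=1e-4)
```

Because both terms are in divergence form, their zeroth Fourier mode vanishes, so the scheme conserves mass to machine precision — the discrete counterpart of the mass constraint appearing in the small-mass existence results.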
This is joint work with Dr. Jakub Skrzeczkowski and Prof. Jose Carrillo.