Forthcoming events in this series


Mon, 29 Apr 2024

11:00 - 12:00
Lecture Room 3

Deep Gaussian processes: theory and applications

Aretha Teckentrup
(University of Edinburgh)
Further Information

Please note that this seminar starts at 11am and finishes at 12pm. 

Abstract

Deep Gaussian processes have proved remarkably successful as a tool for various statistical inference tasks. This success relates in part to the flexibility of these processes and their ability to capture complex, non-stationary behaviours. 

In this talk, we will introduce the general framework of deep Gaussian processes, within which many examples can be constructed, and demonstrate their superiority in inverse problems, including computational imaging and regression.

We will discuss recent algorithmic developments for efficient sampling, as well as recent theoretical results that give crucial insight into the behaviour of the methodology.


Mon, 22 Apr 2024

14:00 - 15:00
Lecture Room 3

Quantization of Bandlimited Graph Signals

Hanna Veselovska
(Technical University of Munich)
Abstract

Graph signals provide a natural representation of data in many applications, such as social networks, web information analysis, sensor networks, and machine learning. Graph signal & data processing is currently an active field of mathematical research that aims to extend the well-developed tools for analyzing conventional signals to signals on graphs while exploiting the underlying connectivity. A key challenge in this context is the problem of quantization, that is, finding efficient ways of representing the values of graph signals with only a finite number of bits.

In this talk, we address the problem of quantizing bandlimited graph signals. We introduce two classes of noise-shaping algorithms for graph signals that differ in their sampling methodologies. We demonstrate that these algorithms can efficiently construct quantized representatives of bandlimited graph-based signals with bounded amplitude.

Inspired by the results of Zhang et al. (2022), we provide theoretical guarantees on the relative error between the true signal and its quantized representative for one of the algorithms.
As will be discussed, the incoherence of the underlying graph plays an important role in the quantization process. Namely, bandlimited signals supported on graphs of lower incoherence allow for smaller relative errors. We support our findings with various numerical experiments showcasing the performance of the proposed quantization algorithms for bandlimited signals defined on graphs with different degrees of incoherence.
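The algorithms in the talk are specific to graph signals, but the noise-shaping idea they build on can be sketched with the classical first-order Sigma-Delta scheme, in which each quantization error is fed back into subsequent steps rather than discarded; the test signal and 5-level alphabet below are illustrative choices, not those from the talk:

```python
import numpy as np

def sigma_delta_quantize(samples, levels):
    # First-order noise-shaping (Sigma-Delta) quantization: the running
    # state u accumulates past quantization errors, so the error is
    # "shaped" away from the components a reconstruction averages over.
    q = np.zeros_like(samples)
    u = 0.0  # internal state (accumulated error)
    for i, s in enumerate(samples):
        u += s
        # round the running state to the nearest admissible level
        q[i] = levels[np.argmin(np.abs(levels - u))]
        u -= q[i]
    return q

rng = np.random.default_rng(1)
signal = 0.4 * np.sin(np.linspace(0, 4 * np.pi, 256))  # bounded amplitude
levels = np.linspace(-1, 1, 5)  # a coarse 5-level alphabet
quantized = sigma_delta_quantize(signal, levels)
```

Because the state stays bounded when the signal amplitude is bounded, the accumulated error telescopes, which is the mechanism behind relative-error guarantees of this type.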

This is joint work with Felix Krahmer (Technical University of Munich), He Lyu (Meta), Rayan Saab (University of California San Diego), and Rongrong Wang (Michigan State University).

Mon, 08 Apr 2024

11:00 - 12:00
Lecture Room 3

Heavy-Tailed Large Deviations and Sharp Characterization of Global Dynamics of SGDs in Deep Learning

Chang-Han Rhee
(Northwestern University, USA)
Abstract

While the typical behaviors of stochastic systems are often deceptively oblivious to the tail distributions of the underlying uncertainties, the ways rare events arise are vastly different depending on whether the underlying tail distributions are light-tailed or heavy-tailed. Roughly speaking, in light-tailed settings, a system-wide rare event arises because everything goes wrong a little bit, as if the entire system had conspired to provoke the rare event (conspiracy principle), whereas in heavy-tailed settings, a system-wide rare event arises because a small number of components fail catastrophically (catastrophe principle). In the first part of this talk, I will introduce the recent developments in the theory of large deviations for heavy-tailed stochastic processes at the sample path level and rigorously characterize the catastrophe principle for such processes.
The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descents (SGDs) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance such capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding of SGDs' global dynamics within complex non-convex loss landscapes. In the second part of this talk, I will characterize the global dynamics of SGDs building on the heavy-tailed large deviations and local stability framework developed in the first part. This leads to the heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noises during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance for the test data.  
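The injection-and-truncation scheme described above can be sketched as follows; the one-dimensional loss landscape, tail index, truncation threshold, and step size are hypothetical choices for illustration, not those analyzed in the talk:

```python
import numpy as np

def sgd_heavy_tailed(grad, x0, lr=0.01, alpha=1.2, clip=1.0,
                     steps=5000, seed=0):
    # SGD with injected symmetric heavy-tailed (Pareto-like) noise,
    # truncated at a fixed threshold before the gradient step.
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(steps):
        # heavy-tailed noise with tail index alpha, random sign
        noise = rng.pareto(alpha) * rng.choice([-1.0, 1.0])
        noise = np.clip(noise, -clip, clip)  # truncation step
        x -= lr * (grad(x) + noise)
    return x

# Hypothetical piecewise loss: a sharp minimum near 0, a flat one near 2
grad = lambda x: 2 * x if x < 1 else 0.5 * (x - 2)
x_final = sgd_heavy_tailed(grad, x0=0.0)
```

The heavy tail supplies occasional large excursions that help the iterate leave sharp basins, while the truncation keeps individual updates bounded; the talk's results make this trade-off precise via heavy-tailed large deviations.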

This talk is based on joint work with Mihail Bazhba, Jose Blanchet, Bohan Chen, Sewoong Oh, Zhe Su, Xingyu Wang, and Bert Zwart.


Bio:

Chang-Han Rhee is an Assistant Professor in Industrial Engineering and Management Sciences at Northwestern University. Before joining Northwestern University, he was a postdoctoral researcher at Centrum Wiskunde & Informatica and Georgia Tech. He received his Ph.D. from Stanford University. His research interests include applied probability, stochastic simulation, experimental design, and the theoretical foundation of machine learning. His research has been recognized with the 2016 INFORMS Simulation Society Outstanding Publication Award, the 2012 Winter Simulation Conference Best Student Paper Award, the 2023 INFORMS George Nicholson Student Paper Competition (2nd place), and the 2013 INFORMS George Nicholson Student Paper Competition (finalist). Since 2022, his research has been supported by the NSF CAREER Award.  

Mon, 04 Mar 2024

14:00 - 15:00
Lecture Room 3

On transport methods for simulation-based inference and data assimilation

Prof Youssef Marzouk
(MIT)
Abstract

Many practical Bayesian inference problems fall into the simulation-based or "likelihood-free" setting, where evaluations of the likelihood function or prior density are unavailable or intractable; instead one can only draw samples from the joint parameter-data prior. Learning conditional distributions is essential to the solution of these problems. 
To this end, I will discuss a powerful class of methods for conditional density estimation and conditional simulation based on transportation of measure. An important application for these methods lies in data assimilation for dynamical systems, where transport enables new approaches to nonlinear filtering and smoothing. 
To illuminate some of the theoretical underpinnings of these methods, I will discuss recent work on monotone map representations, optimization guarantees for learning maps from data, and the statistical convergence of transport-based density estimators.
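In the linear-Gaussian case, a lower-triangular (Knothe-Rosenblatt) transport map can be written in closed form, which gives a minimal sketch of conditional simulation from joint prior samples alone; the simulator, noise level, and observed value below are illustrative assumptions, not examples from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples from the joint parameter-data prior: theta ~ N(0, 1),
# y = theta + noise. Only the simulator is used; the likelihood
# function itself is never evaluated (the "likelihood-free" setting).
theta = rng.standard_normal(100_000)
y = theta + 0.5 * rng.standard_normal(100_000)

# Lower-triangular map S(y, theta) = (S1(y), S2(y, theta)) pushing the
# joint to a standard normal; in the Gaussian case the second component
# is affine in theta, so it is estimated by linear regression on y.
b = np.cov(theta, y)[0, 1] / np.var(y)   # regression coefficient
resid = theta - b * y                    # conditional residual
sigma = resid.std()                      # conditional standard deviation

# Conditional sampling: invert the second map component at observed y*.
y_obs = 1.0
posterior_samples = b * y_obs + sigma * rng.standard_normal(10_000)

# Analytic check: theta | y* is N(y* / (1 + 0.25), 0.2), mean 0.8 here.
```

Nonlinear problems replace the affine second component with a learned monotone map, but the inversion-at-observed-data step that produces conditional samples is the same.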