Mon, 08 Apr 2024

11:00 - 12:00
Lecture Room 3

Heavy-Tailed Large Deviations and Sharp Characterization of Global Dynamics of SGDs in Deep Learning

Chang-Han Rhee
(Northwestern University, USA)
Abstract

While the typical behaviors of stochastic systems are often deceptively oblivious to the tail distributions of the underlying uncertainties, the ways rare events arise are vastly different depending on whether the underlying tail distributions are light-tailed or heavy-tailed. Roughly speaking, in light-tailed settings, a system-wide rare event arises because everything goes wrong a little bit, as if the entire system had conspired to provoke the rare event (conspiracy principle), whereas in heavy-tailed settings, a system-wide rare event arises because a small number of components fail catastrophically (catastrophe principle). In the first part of this talk, I will introduce recent developments in the theory of large deviations for heavy-tailed stochastic processes at the sample path level and rigorously characterize the catastrophe principle for such processes.
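As a rough numerical illustration of the two principles (my own minimal Python sketch, not material from the talk): the snippet below conditions a 50-step random-walk sum on being unusually large and reports what fraction of the total is carried by the single largest increment, for light-tailed (exponential) and heavy-tailed (Pareto, tail index 1.5) increments. The sample sizes and the 99.9% threshold are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 200_000

def max_share_given_large_sum(sampler, quantile=0.999):
    """Among walks whose total is rare-event large, return the average
    fraction of the total contributed by the single largest increment."""
    steps = sampler((trials, n))
    sums = steps.sum(axis=1)
    threshold = np.quantile(sums, quantile)      # defines the "rare" event
    rare = sums >= threshold
    return (steps[rare].max(axis=1) / sums[rare]).mean()

light = lambda size: rng.exponential(1.0, size)      # light-tailed increments
heavy = lambda size: rng.pareto(1.5, size) + 1.0     # heavy-tailed increments (Pareto, alpha = 1.5)

print("light-tailed: largest-increment share =", max_share_given_large_sum(light))
print("heavy-tailed: largest-increment share =", max_share_given_large_sum(heavy))
# Typically the heavy-tailed share is close to 1 (one catastrophic jump dominates),
# while the light-tailed share stays noticeably smaller (everything goes wrong a little).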

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descents (SGDs) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance such capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding of SGDs' global dynamics within complex non-convex loss landscapes. In the second part of this talk, I will characterize the global dynamics of SGDs, building on the heavy-tailed large deviations and local stability framework developed in the first part. This leads to heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noise during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance on test data.
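To make the "inject and then truncate heavy-tailed noise" mechanism concrete, here is a minimal, self-contained Python sketch of such an update rule on a toy one-dimensional loss with one sharp and one flat minimum. The loss, the Cauchy noise, the clipping level, and all other parameters are illustrative assumptions of mine, not the speaker's construction.

import numpy as np

rng = np.random.default_rng(1)

# Toy one-dimensional loss: a sharp (narrow) well near x = -1 and a flat (wide)
# well near x = +2; grad(x) is its derivative.
def grad(x):
    return (100.0 * (x + 1.0) * np.exp(-50.0 * (x + 1.0) ** 2)
            + (x - 2.0) * np.exp(-0.5 * (x - 2.0) ** 2))

def fraction_near_sharp(truncate, eta=0.01, steps=200_000, clip=0.5):
    x = -1.0                                    # start inside the sharp minimum
    near_sharp = 0
    for _ in range(steps):
        noise = rng.standard_cauchy()           # injected heavy-tailed noise
        update = -eta * (grad(x) + noise)
        if truncate:
            update = float(np.clip(update, -clip, clip))  # truncation = clipping the update
        x = float(np.clip(x + update, -5.0, 5.0))         # keep the iterate in a bounded region
        near_sharp += abs(x + 1.0) < 0.3
    return near_sharp / steps

print("time fraction near the sharp minimum, truncated  :", fraction_near_sharp(True))
print("time fraction near the sharp minimum, untruncated:", fraction_near_sharp(False))
# The talk's message is that truncated heavy-tailed dynamics should make visits to the
# sharp well rare; this toy run only makes the update rule concrete and is not intended
# to reproduce the theory quantitatively.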

This talk is based on joint work with Mihail Bazhba, Jose Blanchet, Bohan Chen, Sewoong Oh, Zhe Su, Xingyu Wang, and Bert Zwart.


Bio:

Chang-Han Rhee is an Assistant Professor in Industrial Engineering and Management Sciences at Northwestern University. Before joining Northwestern University, he was a postdoctoral researcher at Centrum Wiskunde & Informatica and Georgia Tech. He received his Ph.D. from Stanford University. His research interests include applied probability, stochastic simulation, experimental design, and the theoretical foundation of machine learning. His research has been recognized with the 2016 INFORMS Simulation Society Outstanding Publication Award, the 2012 Winter Simulation Conference Best Student Paper Award, the 2023 INFORMS George Nicholson Student Paper Competition (2nd place), and the 2013 INFORMS George Nicholson Student Paper Competition (finalist). Since 2022, his research has been supported by the NSF CAREER Award.  
 

Characterization of the Astrophysical Diffuse Neutrino Flux using Starting Track Events in IceCube
Abbasi, R Ackermann, M Adams, J Agarwalla, S Aguilar, J Ahlers, M Alameddine, J Amin, N Andeen, K Anton, G Argüelles, C Ashida, Y Athanasiadou, S Ausborm, L Axani, S Bai, X V, A Baricevic, M Barwick, S Bash, S Basu, V Bay, R Beatty, J Tjus, J Beise, J Bellenghi, C Benning, C BenZvi, S Berley, D Bernardini, E Besson, D Blaufuss, E Blot, S Bontempo, F Book, J Meneguolo, C Böser, S Botner, O Böttcher, J Braun, J Brinson, B Brostean-Kaiser, J Brusa, L Burley, R Busse, R Butterfield, D Campana, M Caracas, I Carloni, K Carpio, J Chattopadhyay, S Chau, N Chen, Z Chirkin, D Choi, S Clark, B Coleman, A Collin, G Connolly, A Conrad, J Coppin, P Corley, R Correa, P Cowen, D Dave, P Clercq, C DeLaunay, J Delgado, D Deng, S Deoskar, K Desai, A Desiati, P Vries, K Wasseige, G DeYoung, T Diaz, A Díaz-Vélez, J Dittmer, M Domi, A Draper, L Dujmovic, H Dutta, K DuVernois, M Ehrhardt, T Eidenschink, L Eimer, A Eller, P Ellinger, E Mentawi, S Elsässer, D Engel, R Erpenbeck, H Evans, J Evenson, P Fan, K Fang, K Farrag, K Fazely, A Fedynitch, A Feigl, N Fiedlschuster, S Finley, C Fischer, L Fox, D Franckowiak, A Fürst, P Gallagher, J Ganster, E Garcia, A Genton, E Gerhardt, L Ghadimi, A Girard-Carillo, C Glaser, C Glüsenkamp, T Gonzalez, J Goswami, S Granados, A Grant, D Gray, S Gries, O Griffin, S Griswold, S Groth, K Günther, C Gutjahr, P Ha, C Haack, C Hallgren, A Halliday, R Halve, L Halzen, F Hamdaoui, H Minh, M Handt, M Hanson, K Hardin, J Harnisch, A Hatch, P Haungs, A Häußler, J Helbing, K Hellrung, J Hermannsgabner, J Heuermann, L Heyer, N Hickford, S Hidvegi, A Hill, C Hill, G Hoffman, K Hori, S Hoshina, K Hostert, M Hou, W Huber, T Hultqvist, K Hünnefeld, M Hussain, R Hymon, K Ishihara, A Iwakiri, W Jacquart, M Janik, O Jansson, M Japaridze, G Jeong, M Jin, M Jones, B Kamp, N Kang, D Kang, W Kang, X Kappes, A Kappesser, D Kardum, L Karg, T Karl, M Karle, A Katil, A Katz, U Kauer, M Kelley, J Khanal, M Zathul, A Kheirandish, A Kiryluk, J Klein, S Kochocki, A Koirala, R Kolanoski, H Kontrimas, T Köpke, L Kopper, C Koskinen, D Koundal, P Kovacevich, M Kowalski, M Kozynets, T Krishnamoorthi, J Kruiswijk, K Krupczak, E Kumar, A Kun, E Kurahashi, N Lad, N Gualda, C Lamoureux, M Larson, M Latseva, S Lauber, F Lazar, J Lee, J DeHolton, K Leszczyńska, A Liao, J Lincetto, M Liubarska, M Lohfink, E Love, C Mariscal, C Lu, L Lucarelli, F Luszczak, W Lyu, Y Madsen, J Magnus, E Mahn, K Makino, Y Manao, E Mancina, S Sainte, W Mariş, I Marka, S Marka, Z Marsee, M Martinez-Soler, I Maruyama, R Mayhew, F McElroy, T McNally, F Mead, J Meagher, K Mechbal, S Medina, A Meier, M Merckx, Y Merten, L Micallef, J Mitchell, J Montaruli, T Moore, R Morii, Y Morse, R Moulai, M Mukherjee, T Naab, R Nagai, R Nakos, M Naumann, U Necker, J Negi, A Neumann, M Niederhausen, H Nisa, M Noell, A Novikov, A Nowicki, S Pollmann, A O'Dell, V Oeyen, B Olivas, A Orsoe, R Osborn, J O'Sullivan, E Pandya, H Park, N Parker, G Paudel, E Paul, L Heros, C Pernice, T Peterson, J Philippen, S Pizzuto, A Plum, M Pontén, A Popovych, Y Rodriguez, M Pries, B Procter-Murphy, R Przybylski, G Raab, C Rack-Helleis, J Rawlins, K Rechav, Z Rehman, A Reichherzer, P Resconi, E Reusch, S Rhode, W Riedel, B Rifaie, A Roberts, E Robertson, S Rodan, S Roellinghoff, G Rongen, M Rosted, A Rott, C Ruhe, T Ruohan, L Ryckbosch, D Safa, I Saffer, J Salazar-Gallegos, D Sampathkumar, P Sandrock, A Santander, M Sarkar, S Savelberg, J Savina, P Schaile, P Schaufel, M Schieler, H Schindler, S Schlüter, B Schlüter, F Schmeisser, N Schmidt, T Schneider, J Schröder, F Schumacher, 
L Sclafani, S Seckel, D Seikh, M Seo, M Seunarine, S Myhr, P Shah, R Shefali, S Shimizu, N Silva, M Skrzypek, B Smithers, B Snihur, R Soedingrekso, J Søgaard, A Soldin, D Soldin, P Sommani, G Spannfellner, C Spiczak, G Spiering, C Stamatikos, M Stanev, T Stezelberger, T Stürwald, T Stuttard, T Sullivan, G Taboada, I Ter-Antonyan, S Terliuk, A Thiesmeyer, M Thompson, W Thwaites, J Tilav, S Tollefson, K Tönnis, C Toscano, S Tosi, D Trettin, A Turcotte, R Twagirayezu, J Elorrieta, M Upadhyay, A Upshaw, K Vaidyanathan, A Valtonen-Mattila, N Vandenbroucke, J Eijndhoven, N Vannerom, D Santen, J Vara, J Veitch-Michaelis, J Venugopal, M Vereecken, M Verpoest, S Veske, D Vijai, A Walck, C Wang, A Weaver, C Weigel, P Weindl, A Weldert, J Wen, A Wendt, C Werthebach, J Weyrauch, M Whitehorn, N Wiebusch, C Williams, D Witthaus, L Wolf, A Wolf, M Wrede, G Xu, X Yanez, J Yildizci, E Yoshida, S Young, R Yu, S Yuan, T Zhang, Z Zhelnin, P Zilberman, P Zimmerman, M (28 Feb 2024) http://arxiv.org/abs/2402.18026v1
Tue, 14 May 2024
11:00
L5

A graph discretized approximation of diffusions with drift and killing on a complete Riemannian manifold

Hiroshi Kawabi
(Keio University)
Abstract

In this talk, we present a graph discretized approximation scheme for diffusions with drift and killing on a complete Riemannian manifold M. More precisely, for a given Schrödinger operator with drift on M having the form A = Δ_b + V, we introduce a family of discrete time random walks in the flow generated by the drift b with killing on a sequence of proximity graphs, which are constructed by partitions cutting M into small pieces. As a main result, we prove that the drifted Schrödinger semigroup {e^{-tA}}_{t≥0} is approximated by discrete semigroups generated by the family of random walks with a suitable scale change. This result gives a finite dimensional summation approximation of a Feynman-Kac type functional integral over M. Furthermore, when M is compact, we also obtain a quantitative error estimate of the convergence.
This talk is based on joint work with Satoshi Ishiwata (Yamagata University); the full paper is available at https://doi.org/10.1007/s00208-024-02809-9.
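As a hedged toy illustration of this flavour of scheme (my own Python construction on the flat circle, not the discretization of the paper): a drifted nearest-neighbour random walk on a uniform grid, weighted at each step by exp(-τV) and run with diffusive time scaling, plays the role of a discrete semigroup approximating {e^{-tA}}_{t≥0}. All grid sizes, scalings, and sign conventions below are illustrative choices.

import numpy as np

N = 400                                    # number of grid points (graph vertices)
h = 2 * np.pi / N                          # mesh size on the circle R/2piZ
tau = h ** 2 / 2                           # one step of the walk ~ tau units of time

x = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
b = np.sin(x)                              # toy drift field on the circle
V = 1.0 + np.cos(x)                        # toy (nonnegative) killing potential

# One-step transition matrix: move to the left/right neighbour with a small drift bias.
P = np.zeros((N, N))
p_right = np.clip(0.5 + 0.5 * h * b, 0.0, 1.0)
for i in range(N):
    P[i, (i + 1) % N] = p_right[i]
    P[i, (i - 1) % N] = 1.0 - p_right[i]

# Killing: each step carries the discrete Feynman-Kac weight exp(-tau * V).
K = np.diag(np.exp(-tau * V)) @ P

# Discrete semigroup after time t: apply K^(t/tau) to a test function f.
t = 0.5
f = np.exp(np.cos(x))                      # arbitrary smooth test function
approx = np.linalg.matrix_power(K, int(round(t / tau))) @ f
print("discrete-semigroup approximation of (e^{-tA} f) at x = 0:", approx[0])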

Insights and caveats from mining local and global temporal motifs in cryptocurrency transaction networks.
Arnold, N Zhong, P Ba, C Steer, B Mondragón, R Cuadrado, F Lambiotte, R Clegg, R CoRR volume abs/2402.09272 (01 Jan 2024)
Wasserstein distributional robustness of neural networks.
Bai, X He, G Jiang, Y Obłój, J NeurIPS (2023)
Thu, 02 May 2024

17:00 - 18:00
L3

Multi topological fields, approximations and NTP2

Silvain Rideau-Kikuchi
(École Normale Supérieure)
Abstract

(Joint work with S. Montenegro)

The striking resemblance between the behaviour of pseudo-algebraically closed, pseudo real closed and pseudo p-adically closed fields has led to numerous attempts to describe their properties in a unified manner. In this talk I will present another of these attempts: the class of pseudo-T-closed fields, where T is an enriched theory of fields. These fields satisfy a « local-global » principle with respect to models of T for the existence of points on varieties. Although it very much resembles previous such attempts, our approach is more model-theoretic in flavour, both in its presentation and in the results we aim for.
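For orientation, the local-global principle has, roughly, the following shape (my schematic paraphrase; the precise hypotheses on the variety and on the admissible models of T are those of the speaker's work with Montenegro):

\[
V(L) \neq \emptyset \ \text{ for every suitable model } L \models T \text{ extending } K
\quad\Longrightarrow\quad V(K) \neq \emptyset,
\]

for suitable varieties V defined over K.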

The first result I would like to present is an approximation result, generalising a result of Kollár on PAC fields and a result of Johnson on henselian fields. It can be rephrased as saying that existential closedness in certain topological enrichments comes for free from existential closedness as a field. The second result is a (model theoretic) classification result for bounded pseudo-T-closed fields, in the guise of a computation of their burden. One of the striking consequences of these two results is that a bounded perfect PAC field with n independent valuations has burden n and, in particular, is NTP2.

Analytic Besov functional calculus for several commuting operators
Batty, C Gomilko, A Kobos, D Tomilov, Y Journal of Spectral Theory volume 14 issue 2 513-556 (30 May 2024)
Tue, 21 May 2024

14:00 - 15:00
L5

Spin link homology and webs in type B

Elijah Bodish
(MIT)
Abstract

In their study of GL(N)-GL(m) Howe duality, Cautis-Kamnitzer-Morrison observed that the GL(N) Reshetikhin-Turaev link invariant can be computed in terms of quantum gl(m). This idea inspired Cautis and Lauda-Queffelec-Rose to give a construction of GL(N) link homology in terms of Khovanov-Lauda's categorified quantum gl(m). There is a Spin(2n+1)-Spin(m) Howe duality, and a quantum analogue that was first studied by Wenzl. In the first half of the talk, I will explain how to use this duality to compute the Spin(2n+1) link polynomial, and present calculations which suggest that the Spin(2n+1) link invariant is obtained from the GL(2n) link invariant by folding. In the second part of the talk, I will introduce the parallel categorified constructions and explain how to use them to define Spin(2n+1) link homology.

This is based on joint work in progress with Ben Elias and David Rose.
