Mathematical Institute

Thu, 23 Nov 2023
14:00

Lecture Room 3

Making SGD parameter-free

Oliver Hinder

(University of Pittsburgh)

Abstract

We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for the step size of stochastic gradient descent (SGD), and a time-uniform concentration result that assumes no a priori bounds on SGD iterates.

Additionally, we present theoretical and numerical results for a dynamic step size schedule for SGD based on a variant of this idea. On a broad range of vision and language transfer learning tasks, our method's performance is close to that of SGD with a tuned learning rate. Also, a per-layer variant of our algorithm approaches the performance of tuned Adam.

This talk is based on papers with Yair Carmon and Maor Ivgi.
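
The abstract does not spell out the dynamic step size schedule, so the following is only an illustrative sketch of one rule in that spirit: the step size is the largest distance travelled from the starting point so far, divided by the square root of the running sum of squared gradient norms. Whether this matches the variant presented in the talk is an assumption. The function names, the seed parameter `r_eps`, and the noisy quadratic test problem are all hypothetical, and the sketch returns the last iterate without any averaging.

```python
import math
import random


def distance_over_gradients_sgd(grad, x0, steps, r_eps=1e-6, seed=0):
    """SGD with step size (max distance from x0) / sqrt(sum of squared gradient norms).

    `grad(x, rng)` returns a stochastic gradient at x as a list of floats.
    `r_eps` is a small hypothetical seed for the distance term so that the
    first step is nonzero; it is not a tuned learning rate.
    """
    rng = random.Random(seed)
    x = list(x0)
    x0_norm = math.sqrt(sum(v * v for v in x0))
    max_dist = r_eps * (1.0 + x0_norm)  # running max of ||x_i - x0||
    grad_sq_sum = 0.0                   # running sum of ||g_i||^2
    for _ in range(steps):
        g = grad(x, rng)
        grad_sq_sum += sum(v * v for v in g)
        if grad_sq_sum == 0.0:
            break  # every gradient so far was exactly zero
        eta = max_dist / math.sqrt(grad_sq_sum)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        dist = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
        max_dist = max(max_dist, dist)
    return x


def noisy_quadratic_grad(target, noise_std):
    """Gradient oracle for f(x) = 0.5 * ||x - target||^2 with Gaussian noise."""
    def grad(x, rng):
        return [xi - ti + rng.gauss(0.0, noise_std) for xi, ti in zip(x, target)]
    return grad


if __name__ == "__main__":
    oracle = noisy_quadratic_grad(target=[3.0, -2.0], noise_std=0.1)
    print(distance_over_gradients_sgd(oracle, x0=[0.0, 0.0], steps=2000))
```

No learning rate is passed in: the step size starts tiny, grows as the iterates move away from the starting point, and then shrinks as gradient norms accumulate.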

Thu, 09 Nov 2023
14:00

Rutherford Appleton Laboratory, nr Didcot

Numerical shape optimization: a bit of theory and a bit of practice

Alberto Paganini

(University of Leicester)

Further Information

Please note this seminar is held at Rutherford Appleton Laboratory (RAL)

Rutherford Appleton Laboratory
Harwell Campus
Didcot
OX11 0QX

Abstract

We use the term shape optimization when we want to find a minimizer of an objective function that assigns real values to shapes of domains. Solving shape optimization problems can be quite challenging, especially when the objective function is constrained by a PDE, in the sense that evaluating the objective function for a given domain shape requires first solving a boundary value problem stated on that domain. The main challenge here is that shape optimization methods must employ numerical methods capable of solving a boundary value problem on a domain that changes after each iteration of the optimization algorithm.

The first part of this talk will provide a gentle introduction to shape optimization. The second part of this talk will highlight how the finite element framework leads to automated numerical shape optimization methods, as realized in the open-source library fireshape. The talk will conclude with a brief overview of some academic and industrial applications of shape optimization.

Thu, 20 Jul 2023
18:00

Lecture Theatre 1

The hat: an aperiodic monotile

Various

Further Information

The theory of tilings in the plane touches on diverse areas of mathematics, physics and beyond. Aperiodic sets of tiles, such as the famous Penrose tiling that you see as you walk into the Mathematical Institute, admit tilings of the plane without any translational symmetry. The Penrose tiling is made of two elementary shapes, or tiles, and mathematicians have long wondered about the existence of a single tile that could tile the plane aperiodically. Earlier this year such a shape was discovered: the hat! This hat turned out to be the first of a whole family, and is being celebrated across a two-day meeting in Oxford.

For this public talk, organised in partnership with the Clay Mathematics Institute, Chaim Goodman-Strauss (National Museum of Mathematics/University of Arkansas), one of the authors of this new work, will give an overview of the hat.

This will be followed by a panel discussion featuring Craig Kaplan (University of Waterloo), Marjorie Senechal (Smith College) and Roger Penrose (University of Oxford) as well as Chaim Goodman-Strauss. The discussion, about the impact of this new discovery and future directions, will be chaired by Henna Koivusalo (University of Bristol).

Mon, 19 Jun 2023

14:00 - 15:00

Lecture Room 6

ScreeNOT: Optimal Singular Value Thresholding and Principal Component Selection in Correlated Noise

Elad Romanov

Abstract

Principal Component Analysis (PCA) is a fundamental and ubiquitous tool in statistics and data analysis.
The bare-bones idea is this. Given a data set of n points y_1, ..., y_n, form their sample covariance S. Eigenvectors corresponding to large eigenvalues--namely directions along which the variation within the data set is large--are usually thought of as "important" or "signal-bearing"; in contrast, weak directions are often interpreted as "noise", and discarded in the subsequent steps of the data analysis pipeline. Principal component (PC) selection is an important methodological question: how large should an eigenvalue be so as to be considered "informative"?
Our main deliverable is ScreeNOT: a novel, mathematically-grounded procedure for PC selection. It is intended as a fully algorithmic replacement for the heuristic and somewhat vaguely-defined procedures that practitioners often use--for example the popular "scree test".
Towards tackling PC selection systematically, we model the data matrix as a low-rank signal plus noise matrix Y = X + Z; accordingly, PC selection is cast as an estimation problem for the unknown low-rank signal matrix X, with the class of permissible estimators being singular value thresholding rules. We consider a formulation of the problem under the spiked model. This asymptotic setting captures some important qualitative features observed across numerous real-world data sets: most of the singular values of Y are arranged neatly in a "bulk", with very few large outlying singular values exceeding the bulk edge. We propose an adaptive algorithm that, given a data matrix, finds the optimal truncation threshold in a data-driven manner under essentially arbitrary noise conditions: we only require that Z has a compactly supported limiting spectral distribution--which may be a priori unknown. Under the spiked model, our algorithm is shown to have rather strong oracle optimality properties: not only does it attain the best error asymptotically, but it also achieves (with high probability) the best error--compared to all alternative thresholds--at finite n.

This is joint work with Matan Gavish (Hebrew University of Jerusalem) and David Donoho (Stanford).
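
To make the class of estimators in the abstract concrete, here is a minimal sketch of a singular value hard-thresholding rule applied to a synthetic Y = X + Z. It is not the ScreeNOT procedure: the threshold is chosen by hand, whereas picking it from the data is what the talk's algorithm is for. The function names, matrix sizes, spike strengths, and the use of i.i.d. Gaussian noise are all assumptions made for illustration.

```python
import numpy as np


def hard_threshold_svd(Y, threshold):
    """Keep only the singular values of Y strictly above `threshold`.

    Returns the thresholded reconstruction and the number of retained
    components. The threshold is supplied by the caller, not estimated.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    keep = s > threshold
    X_hat = (U[:, keep] * s[keep]) @ Vt[keep, :]
    return X_hat, int(keep.sum())


def make_spiked_example(n=200, p=100, spikes=(60.0, 45.0), seed=0):
    """Build a synthetic low-rank signal X and noisy observation Y = X + Z."""
    rng = np.random.default_rng(seed)
    k = len(spikes)
    # Orthonormal left and right singular vectors for the signal.
    U0, _ = np.linalg.qr(rng.standard_normal((n, k)))
    V0, _ = np.linalg.qr(rng.standard_normal((p, k)))
    X = U0 @ np.diag(spikes) @ V0.T
    Z = rng.standard_normal((n, p))  # white noise, for illustration only
    return X, X + Z


if __name__ == "__main__":
    X, Y = make_spiked_example()
    # For this noise the bulk edge is roughly sqrt(n) + sqrt(p), about 24,
    # so a hand-picked threshold of 30 sits just above the bulk.
    X_hat, rank = hard_threshold_svd(Y, threshold=30.0)
    print("retained components:", rank)
    print("error of thresholded estimate:", np.linalg.norm(X_hat - X))
    print("error of raw data:", np.linalg.norm(Y - X))
```

With correlated noise the bulk edge is no longer known in closed form, which is the situation the abstract's data-driven threshold is designed to handle.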

Mon, 19 Jun 2023
13:00

Evaluating one-loop string amplitudes

Sebastian Mizera

(IAS)

Abstract

Scattering amplitudes in string theory are written as formal integrals of correlation functions over the moduli space of punctured Riemann surfaces. It's well-known, albeit not often emphasized, that this prescription is only approximately correct because of the ambiguities in defining the integration domain. In this talk, we propose a resolution of this problem for one-loop open-string amplitudes and present their first evaluation at finite energy and scattering angle. Our method involves a deformation of the integration contour over the modular parameter to a fractal contour introduced by Rademacher in the context of analytic number theory. This procedure leads to explicit and practical formulas for the one-loop planar and non-planar type-I superstring four-point amplitudes, amenable to numerical evaluation. We plot the amplitudes as a function of the Mandelstam invariants and directly verify long-standing conjectures about their behavior at high energies.

Mon, 12 Jun 2023
17:15

Evaluating one-loop string amplitudes

Sebastian Mizera

(IAS)

