Tue, 16 Jan 2024

14:00 - 15:00
L4

Heights of random trees

Louigi Addario-Berry
(McGill University)
Abstract

A rooted tree $T$ has degree sequence $(d_1,\ldots,d_n)$ if $T$ has vertex set $[n]$ and vertex $i$ has $d_i$ children for each $i$ in $[n]$. 

I will describe a line-breaking construction of random rooted trees with given degree sequences, as well as a way of coupling random trees with different degree sequences that also couples their heights to one another. 

The construction and the coupling have several consequences, and I'll try to explain some of these in the talk.

First, let $T$ be a branching process tree with critical (mean one) offspring distribution, and let $T_n$ have the law of $T$ conditioned to have size $n$. Then the following both hold.
1) $\operatorname{height}(T_n)/\log(n)$ tends to infinity in probability. 
2) If the offspring distribution has infinite variance then $\operatorname{height}(T_n)/n^{1/2}$ tends to $0$ in probability. This result settles a conjecture of Svante Janson.
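
Purely as an illustration of the objects in statements 1) and 2) above (and not the line-breaking construction or coupling from the talk), here is a minimal Python sketch that simulates a critical Galton-Watson tree conditioned on its size by naive rejection sampling and returns its height; the function names and the choice of a Geometric(1/2) offspring law are my own.

```python
import random

def preorder_height(counts):
    """Height of a rooted plane tree given its child counts in preorder."""
    height, stack = 0, []                  # stack: children still to visit, per open ancestor
    for k in counts:
        height = max(height, len(stack))   # depth of this vertex = number of open ancestors
        if stack:
            stack[-1] -= 1                 # this vertex fills one child slot of its parent
        if k:
            stack.append(k)                # it stays open until its own k children are visited
        while stack and stack[-1] == 0:
            stack.pop()
    return height

def geometric_offspring():
    """Geometric(1/2) child count on {0, 1, 2, ...}: critical (mean one), finite variance."""
    k = 0
    while random.random() < 0.5:
        k += 1
    return k

def conditioned_gw_height(n, offspring):
    """Height of a Galton-Watson tree conditioned to have exactly n vertices (naive rejection)."""
    while True:
        counts, open_slots = [], 1         # one pending vertex: the root
        while open_slots and len(counts) <= n:
            k = offspring()
            counts.append(k)
            open_slots += k - 1            # this vertex is placed; its k children become pending
        if open_slots == 0 and len(counts) == n:
            return preorder_height(counts)

heights = [conditioned_gw_height(200, geometric_offspring) for _ in range(20)]
print(sum(heights) / len(heights))         # finite variance: mean height grows like const * sqrt(n)
```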

The next two statements relate to random rooted trees with given degree sequences. 
1) For any $\varepsilon > 0$ there is $C > 0$ such that the following holds. If $T$ is a random tree with degree sequence $(d_1,\ldots,d_n)$ and at least $\varepsilon n$ leaves, then $\mathbb{E}(\operatorname{height}(T)) < C \sqrt{n}$. 
2) Consider any random tree $T$ with a fixed degree sequence such that $T$ has no vertices with exactly one child. Then $\operatorname{height}(T)$ is stochastically less than $\operatorname{height}(B)$, where $B$ is a random binary tree of the same size as $T$ (or size one greater, if $T$ has even size). 
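
Again purely as a hedged illustration of statement 1) above (not the construction from the talk): a uniform plane tree with a prescribed degree sequence can be sampled by shuffling the child counts and using the cycle lemma to rotate them into a valid preorder walk, after which the height is read off with a stack as in the previous sketch. The function name and the example degree sequence are illustrative.

```python
import random

def height_with_degree_sequence(degrees):
    """Sample a uniform plane tree whose child counts are `degrees` (they must sum to
    len(degrees) - 1) and return its height."""
    counts = list(degrees)
    random.shuffle(counts)
    # Cycle lemma: exactly one rotation of the counts is a valid preorder walk;
    # it starts just after the first minimum of the partial sums of (count - 1).
    prefix, minimum, start = 0, 0, 0
    for i, k in enumerate(counts):
        prefix += k - 1
        if prefix < minimum:
            minimum, start = prefix, i + 1
    counts = counts[start:] + counts[:start]
    # Read off vertex depths from the preorder walk with a stack of open ancestors.
    height, stack = 0, []
    for k in counts:
        height = max(height, len(stack))
        if stack:
            stack[-1] -= 1
        if k:
            stack.append(k)
        while stack and stack[-1] == 0:
            stack.pop()
    return height

# n = 1001 vertices, 601 of them leaves; the counts sum to n - 1 = 1000 as required.
degrees = [0] * 601 + [2] * 300 + [4] * 100
samples = [height_with_degree_sequence(degrees) for _ in range(100)]
print(sum(samples) / len(samples) / 1001 ** 0.5)   # E[height]/sqrt(n), cf. the C*sqrt(n) bound
```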

This is based on joint work with Serte Donderwinkel and Igor Kortchemski.

Extensional flow of a compressible viscous fluid
McPhail, M, Oliver, J, Parker, R, Griffiths, I, Journal of Fluid Mechanics, volume 977 (22 Dec 2023)
Looking forwards and backwards: dynamics and genealogies of locally regulated populations
Etheridge, A, Kurtz, T, Letter, I, Ralph, P, Tsui, T, Electronic Journal of Probability, volume 29, 1-85 (13 Feb 2024)
Subtle variation in sepsis-III definitions markedly influences predictive performance within and across methods
Cohen, S, Foster, J, Foster, P, Lou, H, Lyons, T, Morley, S, Morrill, J, Ni, H, Palmer, E, Wang, B, Wu, Y, Yang, L, Yang, W, Scientific Reports, volume 14 (22 Jan 2024)
Fri, 10 May 2024
16:00
L1

Talks on Talks

Abstract

What makes a good talk? This year, graduate students and postdocs will give a series of talks on how to give talks! There may even be a small prize for the audience’s favourite.

If you’d like to have a go at informing and entertaining, or you just have an axe to grind about a particularly bad talk you had to sit through, we’d love to hear from you (you can email Ric Wade or ask any of the organizers).
 

Quantum error mitigated classical shadows
Jnane, H, Steinberg, J, Cai, Z, Nguyen, H, Koczor, B, PRX Quantum, volume 5, issue 1 (09 Feb 2024)
Tue, 05 Mar 2024

14:30 - 15:00
L6

Error Bound on Singular Value Approximations by Generalized Nyström

Lorenzo Lazzarino
(Mathematical Institute (University of Oxford))
Abstract

We consider the problem of approximating the singular values of a matrix when provided with approximations to its leading singular vectors. In particular, we focus on the Generalized Nyström (GN) method, a commonly used low-rank approximation, and its error in extracting singular values. Like other approaches, the GN approximation can be interpreted as a perturbation of the original matrix. Up to orthogonal transformations, this perturbation has a peculiar structure that we wish to exploit. Thus, we use the Jordan-Wielandt theorem and similarity transformations to generalize a matrix perturbation theory result on the eigenvalues of a perturbed Hermitian matrix. Finally, combining the above, we derive a bound on the error of the GN singular value approximations. We conclude with preliminary numerical examples. The aim is to study heuristically the sharpness of the bound, to give intuition on how the analysis can be used to compare different approaches, and to provide ideas on how to make the bound computable in practice.
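
For orientation only, here is a generic NumPy sketch of the Generalized Nyström approximation $\hat{A} = AX\,(Y^{\top}AX)^{\dagger}\,Y^{\top}A$ with Gaussian sketch matrices, not the code or the bound from the talk; the function name, oversampling choice and test matrix are illustrative.

```python
import numpy as np

def generalized_nystrom(A, r, oversample=5, seed=0):
    """Rank-r Generalized Nystrom approximation  A X (Y^T A X)^+ Y^T A
    with Gaussian sketches X (n x r) and Y (m x (r + oversample))."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.standard_normal((n, r))
    Y = rng.standard_normal((m, r + oversample))
    AX, YtA = A @ X, Y.T @ A
    core = Y.T @ AX                                  # (r + oversample) x r core matrix
    return AX @ np.linalg.lstsq(core, YtA, rcond=None)[0]

# Compare the leading singular values of A with those of its GN approximation.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((300, 300)))
V, _ = np.linalg.qr(rng.standard_normal((200, 200)))
A = (U[:, :200] * 0.9 ** np.arange(200)) @ V.T       # decaying singular values 0.9^k
A_gn = generalized_nystrom(A, r=20)
print(np.linalg.svd(A, compute_uv=False)[:5])
print(np.linalg.svd(A_gn, compute_uv=False)[:5])     # perturbed versions of the leading five
```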

Tue, 20 Feb 2024

14:30 - 15:00
L6

CMA Light: A novel Minibatch Algorithm for large-scale nonconvex finite-sum optimization

Corrado Coppola
(Sapienza University of Rome)
Abstract
The supervised training of a deep neural network on a given dataset amounts to the unconstrained minimization of a finite sum of continuously differentiable functions, the losses associated with the individual samples. These functions depend on the network parameters and are, in most cases, non-convex. We develop CMA Light, a new globally convergent mini-batch gradient method to tackle this problem. We build on the recently introduced Controlled Minibatch Algorithm (CMA) framework and overcome its main bottleneck, removing the need for at least one evaluation of the whole objective function per iteration. We prove global convergence of CMA Light under mild assumptions and discuss extensive computational results on the same experimental test bed used for CMA, showing that CMA Light requires less computational effort than most state-of-the-art optimizers. Finally, we present early results on a large-scale image classification task.
 
The reference preprint is available on arXiv at https://arxiv.org/abs/2307.15775
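
For orientation only, the sketch below shows the underlying finite-sum problem and a plain (uncontrolled) mini-batch gradient loop; this is not CMA Light and omits the control mechanism that gives its global convergence guarantee, and all names and the toy least-squares data are illustrative.

```python
import numpy as np

def minibatch_gd(grad_batch, w0, n_samples, batch_size=32, lr=1e-2, epochs=10, seed=0):
    """Plain mini-batch gradient descent on f(w) = (1/N) sum_i f_i(w);
    grad_batch(w, idx) returns the average gradient of f_i over the indices idx."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(epochs):
        order = rng.permutation(n_samples)             # one shuffled pass over the data per epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            w -= lr * grad_batch(w, idx)
    return w

# Toy finite sum: f_i(w) = (x_i^T w - y_i)^2 for a small least-squares problem.
rng = np.random.default_rng(1)
X, w_true = rng.standard_normal((1000, 5)), np.arange(1.0, 6.0)
y = X @ w_true + 0.1 * rng.standard_normal(1000)

def grad_batch(w, idx):
    """Average gradient of the squared residuals over the mini-batch `idx`."""
    r = X[idx] @ w - y[idx]
    return 2.0 * X[idx].T @ r / len(idx)

print(minibatch_gd(grad_batch, np.zeros(5), n_samples=1000))   # should approach w_true
```
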
Tue, 20 Feb 2024

14:00 - 14:30
L6

Tensor Methods for Nonconvex Optimization using Cubic-quartic regularization models

Wenqi Zhu
(Mathematical Institute (University of Oxford))
Abstract

High-order tensor methods for solving both convex and nonconvex optimization problems have recently generated significant research interest, due in part to the natural way in which higher derivatives can be incorporated into adaptive regularization frameworks, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, to find the next solution approximation, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information, and regularized by an appropriate power of the change in the iterates. Developing efficient techniques for the solution of such subproblems is currently an active topic of research, and this talk addresses this question for the case of the third-order tensor subproblem.
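
For concreteness, in the notation commonly used for quartically-regularised third-order (AR3-type) methods, the subproblem referred to above takes the form below, where $x_k$ is the current iterate, $s$ the step, $d$ the problem dimension and $\sigma_k > 0$ the regularization weight; this is the standard formulation rather than anything specific to the talk:

$$\min_{s \in \mathbb{R}^d}\; m_k(s) = f(x_k) + \nabla f(x_k)^{\top} s + \tfrac{1}{2}\, s^{\top} \nabla^2 f(x_k)\, s + \tfrac{1}{6}\, \nabla^3 f(x_k)[s,s,s] + \tfrac{\sigma_k}{4}\, \|s\|^4 .$$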


In particular, we propose the CQR algorithmic framework for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation, by sequentially minimizing a sequence of local quadratic models that also incorporate both simple cubic and quartic terms. The role of the cubic term is to crudely approximate local tensor information, while the quartic one provides model regularization and controls progress. We provide necessary and sufficient optimality conditions that fully characterise the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. We show, using our optimality characterisations, that a CQR algorithmic variant has the optimal-order evaluation complexity of $O(\epsilon^{-3/2})$ when applied to minimizing our quartically-regularised cubic subproblem, which can be further improved in special cases. We propose practical CQR variants that judiciously use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances and even superior on ill-conditioned subproblems with special structure.
