Forthcoming events in this series


Mon, 13 Jun 2022

14:00 - 15:00
L4

Highly accurate protein structure prediction with AlphaFold

Jonas Adler
(Google)
Abstract

Predicting a protein’s structure from its primary sequence has been a grand challenge in biology for the past 50 years, holding the promise to bridge the gap between the pace of genomics discovery and resulting structural characterization. In this talk, we will describe work at DeepMind to develop AlphaFold, a new deep learning-based system for structure prediction that achieves high accuracy across a wide range of targets. We demonstrated our system in the 14th biennial Critical Assessment of Protein Structure Prediction (CASP14) on a wide range of difficult targets, where the assessors judged our predictions to be at an accuracy “competitive with experiment” for approximately two-thirds of proteins. The talk will cover the underlying machine learning ideas, the implications for biological research, and some promising directions for further work.

Mon, 06 Jun 2022

14:00 - 15:00
Virtual

Geometry of Molecular Conformations in Cryo-EM

Roy Lederman
(Yale University)
Abstract

Cryo-Electron Microscopy (cryo-EM) is an imaging technology that is revolutionizing structural biology. Cryo-electron microscopes produce many very noisy two-dimensional projection images of individual frozen molecules; unlike related methods, such as computed tomography (CT), the viewing direction of each particle image is unknown. The unknown directions and extreme noise make the determination of the structure of molecules challenging. While other methods for structure determination, such as X-ray crystallography and NMR, measure ensembles of molecules, cryo-electron microscopes produce images of individual particles. Therefore, cryo-EM could potentially be used to study mixtures of conformations of molecules. We will discuss a range of recent methods for analyzing the geometry of molecular conformations using cryo-EM data.
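
The forward model described above (projection images at unknown viewing directions, with extreme noise) can be illustrated with a short simulation. The sketch below is not the speaker's code; the toy volume, rotation angles and noise level are made-up placeholders.

```python
# Illustrative cryo-EM-style forward model: rotate a 3D volume by unknown angles,
# integrate along the z-axis, and add heavy Gaussian noise.
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)

# Toy "molecule": a simple blob on a 32^3 grid (placeholder volume).
n = 32
volume = np.zeros((n, n, n))
volume[12:20, 10:22, 14:18] = 1.0

def noisy_projection(vol, snr=0.1):
    """One particle image: unknown rotation, line integrals along z, low SNR."""
    angles = rng.uniform(0, 360, size=3)           # unknown viewing direction
    v = rotate(vol, angles[0], axes=(0, 1), reshape=False, order=1)
    v = rotate(v, angles[1], axes=(1, 2), reshape=False, order=1)
    v = rotate(v, angles[2], axes=(0, 2), reshape=False, order=1)
    img = v.sum(axis=2)                            # projection along z
    sigma = img.std() / np.sqrt(snr)               # very noisy, as in cryo-EM
    return img + rng.normal(0, sigma, img.shape), angles

images = [noisy_projection(volume)[0] for _ in range(100)]
```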

Mon, 30 May 2022

15:00 - 16:00
Virtual

Geometry of memoryless policy optimization in POMDPs

Guido Montufar
(UCLA)
Abstract

We consider the problem of finding the best memoryless stochastic policy for an infinite-horizon partially observable Markov decision process (POMDP) with finite state and action spaces, under either the discounted or the mean reward criterion. We show that the (discounted) state-action frequencies and the expected cumulative reward are rational functions of the policy, whose degree is determined by the degree of partial observability. We then describe the optimization problem as a linear optimization problem in the space of feasible state-action frequencies subject to polynomial constraints that we characterize explicitly. This allows us to address the combinatorial and geometric complexity of the optimization problem using tools from polynomial optimization. In particular, we estimate the number of critical points and use the polynomial programming description of reward maximization to solve a navigation problem in a grid world. The talk is based on recent work with Johannes Müller.
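
A minimal NumPy sketch (not the authors' code) of the objects in the first sentence: for a toy POMDP with made-up kernels, the discounted state-action frequencies of a memoryless policy solve a linear system, and the expected reward is linear in those frequencies, hence a rational function of the policy.

```python
# Discounted state-action frequencies and expected reward of a memoryless policy.
import numpy as np

gamma = 0.9
nS, nA, nO = 3, 2, 2
rng = np.random.default_rng(1)

P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, s'] transition kernel
O = rng.dirichlet(np.ones(nO), size=nS)         # O[s, o]    observation kernel
r = rng.normal(size=(nS, nA))                   # r[s, a]    instantaneous reward
mu0 = np.ones(nS) / nS                          # initial state distribution

def value(policy):
    """policy[o, a]: memoryless stochastic policy. Returns (eta, expected reward)."""
    tau = O @ policy                             # tau[s, a] = sum_o O[s, o] policy[o, a]
    P_pi = np.einsum('sa,sat->st', tau, P)       # state-to-state kernel under the policy
    # rho solves rho = (1 - gamma) mu0 + gamma P_pi^T rho
    rho = np.linalg.solve(np.eye(nS) - gamma * P_pi.T, (1 - gamma) * mu0)
    eta = rho[:, None] * tau                     # discounted state-action frequencies
    return eta, (eta * r).sum() / (1 - gamma)    # reward is linear in eta

policy = np.full((nO, nA), 1 / nA)               # uniform memoryless policy
eta, R = value(policy)
print(eta, R)
```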

Mon, 16 May 2022

14:00 - 15:00
Virtual

Smooth over-parametrized solvers for non-smooth structured optimisation

Clarice Poon
(University of Bath)
Abstract

Non-smooth optimization is a core ingredient of many imaging and machine learning pipelines. Non-smoothness encodes structural constraints on the solutions, such as sparsity, group sparsity, low-rank and sharp edges. It is also the basis for the definition of robust loss functions such as the square-root lasso. Standard approaches to dealing with non-smoothness leverage either proximal splitting or coordinate descent; their effectiveness typically depends on proper parameter tuning, preconditioning or some sort of support pruning. In this work, we advocate and study a different route. By over-parameterizing and marginalising over certain variables (Variable Projection), we show how many popular non-smooth structured problems can be written as smooth optimization problems. One can then take advantage of quasi-Newton solvers such as L-BFGS, which in practice can lead to substantial performance gains. Another interesting aspect of our proposed solver is its efficiency when handling imaging problems that arise from fine discretizations (unlike proximal methods such as ISTA, whose convergence is known to have an exponential dependency on dimension). On a theoretical level, gradient descent on our over-parameterized formulation can be connected to mirror descent with a varying Hessian metric. This observation can then be used to derive dimension-free convergence bounds and explains the efficiency of our method in the fine-grid regime.
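
A simplified illustration of the general idea (my own toy example, not the speaker's solver, and without the variable-projection step): the lasso becomes a smooth problem after the over-parameterization x = u * v, because ||x||_1 equals the minimum of (||u||^2 + ||v||^2)/2 over all factorizations u * v = x, and the smooth surrogate can be handed to a quasi-Newton solver such as L-BFGS.

```python
# Hadamard over-parameterization of the lasso, solved with L-BFGS.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n, lam = 50, 200, 0.1
A = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.normal(size=m)

def f_and_grad(z):
    u, v = z[:n], z[n:]
    x = u * v
    res = A @ x - b
    g_x = A.T @ res                                   # gradient w.r.t. x
    f = 0.5 * res @ res + 0.5 * lam * (u @ u + v @ v)
    grad = np.concatenate([g_x * v + lam * u, g_x * u + lam * v])
    return f, grad

z0 = 0.1 * rng.normal(size=2 * n)
out = minimize(f_and_grad, z0, jac=True, method="L-BFGS-B")
x_hat = out.x[:n] * out.x[n:]
print(np.linalg.norm(x_hat - x_true))                 # small: the sparse signal is recovered
```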

Mon, 07 Mar 2022

14:00 - 15:00
Virtual

Towards practical estimation of Brenier maps

Jonathan Niles-Weed
(New York University)
Abstract

Given two probability distributions in R^d, a transport map is a function which maps samples from one distribution into samples from the other. For absolutely continuous measures, Brenier proved a remarkable theorem identifying a unique canonical transport map, which is "monotone" in a suitable sense. We study the question of whether this map can be efficiently estimated from samples. The minimax rates for this problem were recently established by Hutter and Rigollet (2021), but the estimator they propose is computationally infeasible in dimensions greater than three. We propose two new estimators, one minimax optimal and one not, which are significantly more practical to compute and implement. The analysis of these estimators is based on new stability results for the optimal transport problem and its regularized variants. Based on joint work with Manole, Balakrishnan, and Wasserman and with Pooladian.
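
One computationally cheap estimator in this spirit, sketched below, runs Sinkhorn iterations between the two samples and uses the barycentric projection of the entropic coupling as a plug-in estimate of the Brenier map. This is an illustration of the regularized-OT route rather than a reproduction of either estimator from the talk; the regularization parameter and sample sizes are arbitrary.

```python
# Entropic OT between two samples and the barycentric-projection map estimate.
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 500
X = rng.normal(size=(n, d))                                 # samples from the source
Y = rng.normal(size=(n, d)) @ np.diag([2.0, 0.5]) + 1.0     # samples from the target

C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1) / 2      # quadratic cost matrix
eps = 0.1
K = np.exp(-C / eps)
a = b = np.ones(n) / n                                      # uniform empirical weights
u = np.ones(n); v = np.ones(n)
for _ in range(500):                                        # Sinkhorn fixed-point updates
    u = a / (K @ v)
    v = b / (K.T @ u)
pi = u[:, None] * K * v[None, :]                            # entropic optimal coupling

# Barycentric projection: estimated transport map evaluated at each source sample.
T_hat = (pi @ Y) / pi.sum(axis=1, keepdims=True)
print(T_hat[:3])
```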

Mon, 21 Feb 2022

14:00 - 15:00
Virtual

Why things don’t work — On the extended Smale's 9th and 18th problems (the limits of AI) and methodological barriers

Anders Hansen
(University of Cambridge)
Abstract

The alchemists wanted to create gold, Hilbert wanted an algorithm to solve Diophantine equations, researchers want to make deep learning robust in AI, MATLAB wants (but fails) to detect when it provides wrong solutions to linear programs etc. Why does one not succeed in so many of these fundamental cases? The reason is typically methodological barriers. The history of  science is full of methodological barriers — reasons for why we never succeed in reaching certain goals. In many cases, this is due to the foundations of mathematics. We will present a new program on methodological barriers and foundations of mathematics,  where — in this talk — we will focus on two basic problems: (1) The instability problem in deep learning: Why do researchers fail to produce stable neural networks in basic classification and computer vision problems that can easily be handled by humans — when one can prove that there exist stable and accurate neural networks? Moreover, AI algorithms can typically not detect when they are wrong, which becomes a serious issue when striving to create trustworthy AI. The problem is more general, as for example MATLAB's linprog routine is incapable of certifying correct solutions of basic linear programs. Thus, we’ll address the following question: (2) Why are algorithms (in AI and computations in general) incapable of determining when they are wrong? These questions are deeply connected to the extended Smale’s 9th and 18th problems on the list of mathematical problems for the 21st century. 

Mon, 14 Feb 2022

14:00 - 15:00
Virtual

The convex geometry of blind deconvolution

Felix Krahmer
(Technical University of Munich)
Abstract

Blind deconvolution problems are ubiquitous in many areas of imaging and technology and have been the object of study for several decades. Recently, motivated by the theory of compressed sensing, a new viewpoint has been introduced, driven by applications in wireless communication, where a signal is transmitted through an unknown channel. Namely, the idea is to randomly embed the signal into a higher dimensional space before transmission. Due to the resulting redundancy, one can hope to recover both the signal and the channel parameters. In this talk we analyze convex approaches based on lifting, as first studied by Ahmed et al. (2014). We show that one encounters a fundamentally different geometric behavior as compared to generic bilinear measurements. Namely, for very small levels of deterministic noise, the error bounds based on common paradigms no longer scale linearly in the noise level, but one encounters dimensional constants or a sublinear scaling. For larger, arguably more realistic, noise levels, in contrast, the scaling is again near-linear.

This is joint work with Yulia Kostina (TUM) and Dominik Stöger (KU Eichstätt-Ingolstadt).
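
A small sketch of the lifted convex relaxation analysed in the talk, in the style of Ahmed et al. (2014): the bilinear measurements y_i = <b_i, h><a_i, x> are linear in the rank-one matrix X = h x^T, which can be recovered by nuclear-norm minimization. The toy dimensions and Gaussian measurement vectors below are placeholders, and the example assumes cvxpy is available.

```python
# Lifting + nuclear-norm minimization for a toy noiseless blind deconvolution problem.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
K, N, m = 8, 8, 120                     # sizes of h, x and number of measurements
h = rng.normal(size=K)
x = rng.normal(size=N)
B = rng.normal(size=(m, K))             # known "channel" measurement vectors
A = rng.normal(size=(m, N))             # known "signal" measurement vectors
y = (B @ h) * (A @ x)                   # y_i = b_i^T (h x^T) a_i

X = cp.Variable((K, N))
constraints = [cp.sum(cp.multiply(B[i][:, None] * A[i][None, :], X)) == y[i]
               for i in range(m)]
prob = cp.Problem(cp.Minimize(cp.norm(X, "nuc")), constraints)
prob.solve()

# The top singular pair of X gives h and x up to the inherent scaling ambiguity.
U, s, Vt = np.linalg.svd(X.value)
print(s[:3])                            # essentially rank one if recovery succeeded
```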

Mon, 24 Jan 2022

14:00 - 15:00
Virtual

Exploiting low dimensional data structures in volumetric X-ray imaging

Thomas Blumensath
(University of Southampton)
Abstract

Volumetric X-ray tomography is used in many areas, including medical imaging, many fields of scientific investigation and several industrial settings. Yet complex X-ray physics and the significant size of individual X-ray tomography data-sets pose a range of data-science challenges, ranging from the development of efficient computational methods and the modelling of complex non-linear relationships to the effective analysis of large volumetric images and the inversion of several ill-conditioned inverse problems, all of which prevent the application of these techniques in many advanced imaging settings of interest. This talk will highlight several applications where specific data-science issues arise and showcase a range of approaches developed recently at the University of Southampton to overcome many of these obstacles.

Mon, 29 Nov 2021

14:00 - 15:00

Parameter Estimation for the McKean-Vlasov Stochastic Differential Equation

Nikolas Kantas
(Imperial College London)
Abstract

We consider the problem of parameter estimation for a McKean-Vlasov stochastic differential equation and the associated system of weakly interacting particles. The problem is motivated by applications in areas such as neuroscience, the social sciences (opinion dynamics, cooperative behaviours), financial mathematics and statistical physics. We will first survey some model properties related to propagation of chaos and ergodicity, and then move on to discuss the problem of parameter estimation in both offline and online settings. In the online case, we propose an online estimator which evolves according to a continuous-time stochastic gradient descent algorithm on the asymptotic log-likelihood of the interacting particle system. The talk will present our convergence results and then show some numerical results for two examples, a linear mean field model and a stochastic opinion dynamics model. This is joint work with Louis Sharrock, Panos Parpas and Greg Pavliotis. Preprint: https://arxiv.org/abs/2106.13751
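
A rough sketch (mine, not the paper's code) of the two ingredients mentioned above for the linear mean-field model dX = -theta (X - E[X]) dt + sigma dW: an Euler-Maruyama simulation of the interacting particle system, and a naive online stochastic-gradient update of theta driven by the likelihood of the particle system. The learning-rate schedule and all numbers are arbitrary choices.

```python
# Interacting particle system + online parameter estimation for a linear mean-field SDE.
import numpy as np

rng = np.random.default_rng(0)
theta_true, sigma = 2.0, 0.5
N, dt, T = 200, 1e-3, 20.0
steps = int(T / dt)

X = rng.normal(size=N)                    # particles
theta_hat = 0.5                           # initial guess for the online estimator

for k in range(steps):
    m = X.mean()                          # empirical mean approximates E[X_t]
    dW = rng.normal(scale=np.sqrt(dt), size=N)
    dX = -theta_true * (X - m) * dt + sigma * dW   # one Euler-Maruyama step

    # Online estimator: gradient of the log-likelihood increment w.r.t. theta,
    # averaged over particles, with a slowly decaying learning rate.
    db_dtheta = -(X - m)
    drift_hat = -theta_hat * (X - m)
    grad = np.mean(db_dtheta * (dX - drift_hat * dt)) / sigma**2
    lr = 5.0 / (1.0 + 0.1 * k * dt)
    theta_hat += lr * grad

    X = X + dX

print(theta_hat)                          # should be close to theta_true = 2.0
```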

Mon, 22 Nov 2021

14:00 - 15:00
Virtual

On the Convergence of Langevin Monte Carlo: The Interplay between Tail Growth and Smoothness

Murat Erdogdu
(University of Toronto)
Abstract

We study sampling from a target distribution $e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm. For any potential function $f$ whose tails behave like $\|x\|^\alpha$ for $\alpha \in [1,2]$, and which has a $\beta$-Hölder continuous gradient, we derive the sufficient number of steps to reach the $\epsilon$-neighborhood of a $d$-dimensional target distribution as a function of $\alpha$ and $\beta$. Our rate estimate, in terms of $\epsilon$ dependency, is not directly influenced by the tail growth rate $\alpha$ of the potential function as long as its growth is at least linear, and it only relies on the order of smoothness $\beta$.

Our rate recovers the best known rate which was established for strongly convex potentials with Lipschitz gradient in terms of $\epsilon$ dependency, but we show that the same rate is achievable for a wider class of potentials that are degenerately convex at infinity.
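
For concreteness, a minimal sketch of the unadjusted LMC iteration studied in the talk, x_{k+1} = x_k - eta * grad f(x_k) + sqrt(2 eta) * xi_k with standard Gaussian xi_k. The target below, a smooth potential with $\|x\|^\alpha$-type tails, is only a toy example and the step size is not tuned.

```python
# Unadjusted Langevin Monte Carlo for a potential with alpha-growing tails.
import numpy as np

rng = np.random.default_rng(0)
d, eta, n_steps = 10, 1e-2, 50_000
alpha = 1.5

def grad_f(x):
    # f(x) = (1 + ||x||^2)^(alpha/2): tails grow like ||x||^alpha, smooth at the origin.
    return alpha * x * (1.0 + x @ x) ** (alpha / 2 - 1)

x = np.zeros(d)
samples = np.empty((n_steps, d))
for k in range(n_steps):
    x = x - eta * grad_f(x) + np.sqrt(2 * eta) * rng.normal(size=d)
    samples[k] = x

print(samples[10_000:].mean(axis=0))   # crude estimate of the target mean (should be ~0)
```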

Mon, 08 Nov 2021

14:00 - 15:00
Virtual

Structured (in)feasibility: nonmonotone operator splitting in nonlinear spaces

Russell Luke
(University of Göttingen)
Abstract

The success of operator splitting techniques for convex optimization has led to an explosion of methods for solving large-scale and nonconvex optimization problems via convex relaxation.

This success comes at the cost of overlooking direct approaches to operator splitting that embrace some of the more inconvenient aspects of many model problems, namely nonconvexity, nonsmoothness and infeasibility. I will introduce some of the tools we have developed for handling these issues, and present sketches of the basic results we can obtain.

The formalism is in general metric spaces, but most applications have their basis in Euclidean spaces.  Along the way I will try to point out connections to other areas of intense interest, such as optimal mass transport.
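
To fix ideas, here is a small Euclidean sketch of operator splitting applied directly to a nonconvex feasibility problem: find a point in the intersection of an affine set A and the (nonconvex) set B of s-sparse vectors, using the Douglas-Rachford iteration z <- z + P_B(2 P_A(z) - z) - P_A(z). This toy example is mine, not the speaker's; convergence is not guaranteed in general.

```python
# Douglas-Rachford splitting for a sparse affine feasibility problem (nonconvex B).
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 30, 100, 5
M = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[rng.choice(n, s, replace=False)] = rng.normal(size=s)
b = M @ x_true
M_pinv = np.linalg.pinv(M)

def P_A(z):                          # projection onto the affine set {x : Mx = b}
    return z - M_pinv @ (M @ z - b)

def P_B(z):                          # projection onto the s-sparse vectors (nonconvex)
    out = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[-s:]
    out[idx] = z[idx]
    return out

z = rng.normal(size=n)
for _ in range(2000):
    pa = P_A(z)
    z = z + P_B(2 * pa - z) - pa
x_hat = P_A(z)                       # "shadow" iterate
print(np.linalg.norm(M @ x_hat - b), np.linalg.norm(x_hat - x_true))
```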

Thu, 14 Oct 2021

14:00 - 15:00
Virtual

What is the role of a neuron?

David Bau
(MIT)
Abstract

One of the great challenges of neural networks is to understand how they work.  For example: does a neuron encode a meaningful signal on its own?  Or is a neuron simply an undistinguished and arbitrary component of a feature vector space?  The tension between the neuron doctrine and the population coding hypothesis is one of the classical debates in neuroscience. It is a difficult debate to settle without an ability to monitor every individual neuron in the brain.

Within artificial neural networks we can examine every neuron. Beginning with the simple proposal that an individual neuron might represent one internal concept, we conduct studies relating deep network neurons to human-understandable concepts in a concrete, quantitative way: Which neurons? Which concepts? Are neurons more meaningful than an arbitrary feature basis? Do neurons play a causal role? We examine both simplified settings and state-of-the-art networks in which neurons learn how to represent meaningful objects within the data without explicit supervision.

Following this inquiry in computer vision leads us to insights about the computational structure of practical deep networks that enable several new applications, including semantic manipulation of objects in an image; understanding of the sparse logic of a classifier; and quick, selective editing of generalizable rules within a fully trained generative network.  It also presents an unanswered mathematical question: why is such disentanglement so pervasive?

In the talk, we challenge the notion that the internal calculations of a neural network must be hopelessly opaque. Instead, we propose to tear back the curtain and chart a path through the detailed structure of a deep network by which we can begin to understand its logic.

Fri, 11 Jun 2021

14:00 - 15:00

Geometric Methods for Machine Learning and Optimization

Melanie Weber
(Princeton)
Abstract

Many machine learning applications involve non-Euclidean data, such as graphs, strings or matrices. In such cases, exploiting Riemannian geometry can deliver algorithms that are computationally superior to standard (Euclidean) nonlinear programming approaches. This observation has resulted in an increasing interest in Riemannian methods in the optimization and machine learning community.

In the first part of the talk, we consider the task of learning a robust classifier in hyperbolic space. Such spaces have received a surge of interest for representing large-scale, hierarchical data, because they achieve better representation accuracy with fewer dimensions. We present the first theoretical guarantees for the (robust) large-margin learning problem in hyperbolic space and discuss conditions under which hyperbolic methods are guaranteed to surpass the performance of their Euclidean counterparts. In the second part, we introduce Riemannian Frank-Wolfe (RFW) methods for constrained optimization on manifolds. Here, we discuss matrix-valued tasks for which such Riemannian methods are more efficient than classical Euclidean approaches. In particular, we consider applications of RFW to the computation of Riemannian centroids and Wasserstein barycenters, both of which are crucial subroutines in many machine learning methods.
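
Two geometric primitives behind hyperbolic classifiers and Riemannian optimizers are the Poincaré-ball distance and the conversion of a Euclidean gradient into a Riemannian one. The sketch below is illustrative only and does not reproduce the talk's large-margin formulation or RFW.

```python
# Poincaré-ball distance and Riemannian gradient rescaling (conformal metric).
import numpy as np

def poincare_dist(x, y):
    """Geodesic distance between two points in the open unit ball."""
    nx, ny = np.sum(x * x), np.sum(y * y)
    delta = 2 * np.sum((x - y) ** 2) / ((1 - nx) * (1 - ny))
    return np.arccosh(1 + delta)

def riemannian_grad(x, euclid_grad):
    """The Poincaré-ball metric is conformal: g_x = (2 / (1 - ||x||^2))^2 * I."""
    return ((1 - np.sum(x * x)) ** 2 / 4) * euclid_grad

x = np.array([0.1, 0.2]); y = np.array([-0.3, 0.5])
print(poincare_dist(x, y))
print(riemannian_grad(x, np.array([1.0, 0.0])))
```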

Fri, 04 Jun 2021

12:00 - 13:00

Fast Symmetric Tensor Decomposition

Joe Kileel
(UT Austin)
Abstract

From latent variable models in machine learning to inverse problems in computational imaging, tensors pervade the data sciences.  Often, the goal is to decompose a tensor into a particular low-rank representation, thereby recovering quantities of interest about the application at hand.  In this talk, I will present a recent method for low-rank CP symmetric tensor decomposition.  The key ingredients are Sylvester’s catalecticant method from classical algebraic geometry and the power method from numerical multilinear algebra.  In simulations, the method is roughly one order of magnitude faster than existing CP decomposition algorithms, with similar accuracy.  I will state guarantees for the relevant non-convex optimization problem, and robustness results when the tensor is only approximately low-rank (assuming an appropriate random model).  Finally, if the tensor being decomposed is a higher-order moment of data points (as in multivariate statistics), our method may be performed without explicitly forming the moment tensor, opening the door to high-dimensional decompositions.  This talk is based on joint works with João Pereira, Timo Klock and Tammy Kolda. 
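
For concreteness, a minimal sketch of the numerical-multilinear-algebra ingredient mentioned above: the symmetric tensor power iteration v <- T(I, v, v)/||T(I, v, v)||, which extracts one rank-one component of an orthogonally decomposable order-3 tensor. The full method in the talk combines this with Sylvester's catalecticant construction, which is not shown here; the toy tensor below is made up.

```python
# Symmetric tensor power iteration on a low-rank orthogonally decomposable tensor.
import numpy as np

rng = np.random.default_rng(0)
d, r = 10, 3
A = np.linalg.qr(rng.normal(size=(d, r)))[0]          # orthonormal components a_i
lam = np.array([3.0, 2.0, 1.0])
T = np.einsum('k,ik,jk,lk->ijl', lam, A, A, A)        # T = sum_i lam_i a_i^{x3}

v = rng.normal(size=d); v /= np.linalg.norm(v)
for _ in range(100):
    v = np.einsum('ijl,j,l->i', T, v, v)              # contraction T(I, v, v)
    v /= np.linalg.norm(v)

lam_hat = np.einsum('ijl,i,j,l->', T, v, v, v)        # recovered weight
print(lam_hat, np.abs(A.T @ v))                       # v aligns with one of the a_i
```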

Fri, 28 May 2021

12:00 - 13:00

Invariants for persistent homology and their stability

Nina Otter
(UCLA)
Abstract

One of the most successful methods in topological data analysis (TDA) is persistent homology, which associates a one-parameter family of spaces to a data set, and gives a summary — an invariant called "barcode" — of how topological features, such as the number of components, holes, or voids evolve across the parameter space. In many applications one might wish to associate a multiparameter family of spaces to a data set. There is no generalisation of the barcode to the multiparameter case, and finding algebraic invariants that are suitable for applications is one of the biggest challenges in TDA.

The use of persistent homology in applications is justified by the validity of certain stability results. At the core of such results is a notion of distance between the invariants that one associates to data sets. While such distances are well-understood in the one-parameter case, the study of distances for multiparameter persistence modules is more challenging, as they rely on a choice of suitable invariant.

In this talk I will first give a brief introduction to multiparameter persistent homology. I will then present a general framework to study stability questions in multiparameter persistence: I will discuss which properties we would like invariants to satisfy, present different ways to associate distances to such invariants, and finally illustrate how this framework can be used to derive new stability results. No prior knowledge on the subject is assumed.

The talk is based on joint work with Barbara Giunti, John Nolan and Lukas Waas. 
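
As a warm-up for the one-parameter invariant mentioned in the first paragraph, here is a short from-scratch computation of the degree-0 barcode of a Vietoris-Rips filtration, using a union-find over edges sorted by length (essentially Kruskal's algorithm). Multiparameter persistence, the actual topic of the talk, admits no such barcode; this is only background illustration on made-up data.

```python
# Degree-0 persistence barcode of a point cloud via union-find.
import numpy as np
from itertools import combinations

def h0_barcode(points):
    n = len(points)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    edges = sorted((np.linalg.norm(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    bars = []                                   # every component is born at scale 0
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                            # two components merge: one bar dies at d
            parent[ri] = rj
            bars.append((0.0, d))
    bars.append((0.0, np.inf))                  # the last component never dies
    return bars

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(size=(20, 2)), rng.normal(size=(20, 2)) + 8.0])  # two clusters
print(sorted(h0_barcode(pts), key=lambda b: -b[1])[:3])  # one long bar per cluster
```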

Fri, 12 Mar 2021

12:00 - 13:00

The Metric is All You Need (for Disentangling)

David Pfau
(DeepMind)
Abstract

Learning a representation from data that disentangles different factors of variation is hypothesized to be a critical ingredient for unsupervised learning. Defining disentangling is challenging - a "symmetry-based" definition was provided by Higgins et al. (2018), but no prescription was given for how to learn such a representation. We present a novel nonparametric algorithm, the Geometric Manifold Component Estimator (GEOMANCER), which partially answers the question of how to implement symmetry-based disentangling. We show that fully unsupervised factorization of a data manifold is possible if the true metric of the manifold is known and each factor manifold has nontrivial holonomy – for example, rotation in 3D. Our algorithm works by estimating the subspaces that are invariant under random walk diffusion, giving an approximation to the de Rham decomposition from differential geometry. We demonstrate the efficacy of GEOMANCER on several complex synthetic manifolds. Our work reduces the question of whether unsupervised disentangling is possible to the question of whether unsupervised metric learning is possible, providing a unifying insight into the geometric nature of representation learning.

Fri, 05 Mar 2021

12:00 - 13:00

Linear convergence of an alternating polar decomposition method for low rank orthogonal tensor approximations

Ke Ye
(Chinese Academy of Sciences)
Abstract

Low rank orthogonal tensor approximation (LROTA) is an important problem in tensor computations and their applications. A classical and widely used algorithm is the alternating polar decomposition method (APD). In this talk, I will first give a very brief introduction to tensors and their decompositions. After that, an improved version of the classical APD, named iAPD, will be proposed, and the following four fundamental properties of iAPD will be discussed: (i) the algorithm converges globally and the whole sequence converges to a KKT point without any assumption; (ii) it exhibits an overall sublinear convergence with an explicit rate which is sharper than the usual O(1/k) for first-order methods in optimization; (iii) more importantly, it converges R-linearly for a generic tensor without any assumption; (iv) for almost all LROTA problems, iAPD reduces to APD after finitely many iterations if it converges to a local minimizer. If time permits, I will also present some numerical experiments.
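
The basic subroutine behind APD, sketched here for concreteness, is the polar decomposition M = QH: the orthogonal factor Q = U V^T from the SVD M = U S V^T is the closest matrix with orthonormal columns to M. In (i)APD this update is applied alternately to the factor matrices of the tensor approximation; the full algorithm is not reproduced here.

```python
# Polar factor of a matrix via the SVD: the core update used by alternating polar decomposition.
import numpy as np

def polar_factor(M):
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

M = np.random.default_rng(0).normal(size=(6, 3))
Q = polar_factor(M)
print(np.allclose(Q.T @ Q, np.eye(3)))       # columns of the polar factor are orthonormal
```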

Fri, 26 Feb 2021

12:00 - 13:00

The magnitude of point-cloud data (cancelled)

Nina Otter
(UCLA)
Abstract

Magnitude is an isometric invariant of metric spaces that was introduced by Tom Leinster in 2010, and is currently the object of intense research, since it has been shown to encode many invariants of a metric space such as volume, dimension, and capacity.

Magnitude homology is a homology theory for metric spaces, introduced by Hepworth-Willerton and Leinster-Shulman, which categorifies magnitude in the same way that the singular homology of a topological space categorifies its Euler characteristic.

In this talk I will first introduce magnitude and magnitude homology. I will then give an overview of existing results and current research in this area, explain how magnitude homology is related to persistent homology, and finally discuss new stability results for magnitude and how it can be used to study point cloud data.

This talk is based on  joint work in progress with Miguel O’Malley and Sara Kalisnik, as well as the preprint https://arxiv.org/abs/1807.01540.
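
For a finite metric space the definition is very concrete: form the similarity matrix Z = exp(-t D) from the distance matrix D, and, when Z is invertible, the magnitude at scale t is the sum of the entries of Z^{-1}. The sketch below illustrates this on a made-up point cloud; the scales t are arbitrary.

```python
# Magnitude of a finite point cloud at several scales.
import numpy as np
from scipy.spatial.distance import cdist

def magnitude(points, t=1.0):
    D = cdist(points, points)
    Z = np.exp(-t * D)
    return np.linalg.inv(Z).sum()

rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
# Magnitude interpolates between ~1 (tiny scale) and the number of points (large scale),
# which is one sense in which it measures the "effective size" of the space.
for t in (0.01, 1.0, 100.0):
    print(t, magnitude(pts, t))
```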

Fri, 19 Feb 2021

12:00 - 13:00

The Unlimited Sampling Approach to Computational Sensing and Imaging

Ayush Bhandari
(Imperial College London)
Abstract

Digital data capture is the backbone of all modern-day systems, and the “Digital Revolution” has aptly been termed the Third Industrial Revolution. Underpinning digital representation is the Shannon-Nyquist sampling theorem; more recent developments include compressive sensing approaches. The fact that there is a physical limit to the amplitudes that sensors can measure poses a fundamental bottleneck when it comes to leveraging the performance guaranteed by recovery algorithms. In practice, whenever a physical signal exceeds the maximum recordable range, the sensor saturates, resulting in permanent information loss. Examples include (a) dosimeter saturation during the Chernobyl reactor accident, reporting radiation levels far lower than the true value, and (b) loss of visual cues in self-driving cars coming out of a tunnel (due to sudden exposure to light).

To reconcile this gap between theory and practice, we introduce the Unlimited Sensing Framework (USF), which is based on a co-design of hardware and algorithms. On the hardware front, our work is based on a radically different analog-to-digital converter (ADC) design, which allows the ADCs to produce modulo or folded samples. On the algorithms front, we develop new, mathematically guaranteed recovery strategies.

In the first part of this talk, we prove a sampling theorem akin to the Shannon-Nyquist criterion. We show that, remarkably, despite the non-linearity in the sensing pipeline, the sampling rate only depends on the signal’s bandwidth. Our theory is complemented with a stable recovery algorithm. Beyond the theoretical results, we will also present a hardware demo that shows our approach in action.

Moving further, we reinterpret the unlimited sensing framework as a generalized linear model that motivates a new class of inverse problems. We conclude this talk by presenting new results in the context of single-shot high-dynamic-range (HDR) imaging, sensor array processing and HDR tomography based on the modulo Radon transform.
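
A toy sketch of the folding non-linearity and the simplest recovery idea behind this framework: if a signal is sampled densely enough that consecutive samples differ by less than lambda, then the first difference of the folded samples, re-folded into [-lambda, lambda), equals the first difference of the true samples, and a cumulative sum recovers the signal up to an additive multiple of 2*lambda. The published recovery uses higher-order differences and has explicit guarantees; this is only an illustration on made-up data.

```python
# Modulo ("folded") sampling and naive first-difference recovery.
import numpy as np

lam = 1.0
def fold(x):                                  # centered modulo into [-lam, lam)
    return np.mod(x + lam, 2 * lam) - lam

t = np.linspace(0, 1, 2000)
signal = 4.0 * np.sin(2 * np.pi * 3 * t)      # amplitude far above the ADC range lam
y = fold(signal)                              # what the modulo ADC records

diff = fold(np.diff(y))                       # equals np.diff(signal) under dense sampling
recovered = np.concatenate([[y[0]], y[0] + np.cumsum(diff)])
print(np.max(np.abs(recovered - signal)))     # ~0 (up to a constant multiple of 2*lam)
```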

Fri, 20 Nov 2020

12:00 - 13:00

Selection dynamics for deep neural networks

Peter Markowich
(KAUST)
Abstract

We present a partial differential equation framework for deep residual neural networks and for the associated learning problem. This is done by carrying out the continuum limits of neural networks with respect to width and depth. We study the well-posedness, the large-time solution behavior, and the characterization of the steady states of the forward problem. Several useful time-uniform estimates and stability/instability conditions are presented. We state and prove optimality conditions for the inverse deep learning problem, using standard variational calculus, the Hamilton-Jacobi-Bellman equation and the Pontryagin maximum principle. This serves to establish a mathematical foundation for investigating the algorithmic and theoretical connections between neural networks, PDE theory, variational analysis, optimal control, and deep learning.

This is based on joint work with Hailiang Liu.
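
The depth continuum limit can be illustrated in a few lines: a residual block x_{k+1} = x_k + (T/L) f(x_k, theta_k) is the forward-Euler discretization of the ODE dx/dt = f(x, theta(t)), so letting the depth L grow recovers the continuous-time description used in the talk. The sketch below uses random, untrained weights purely for illustration.

```python
# A deep residual network viewed as a forward-Euler scheme for an ODE.
import numpy as np

rng = np.random.default_rng(0)
d, L, T = 4, 100, 1.0
dt = T / L
W = rng.normal(size=(L, d, d)) / np.sqrt(d)   # one weight matrix per "layer" / time step
b = 0.1 * rng.normal(size=(L, d))

def f(x, k):                                   # the learnt velocity field at layer k
    return np.tanh(W[k] @ x + b[k])

x = rng.normal(size=d)                         # network input
for k in range(L):                             # residual network == Euler time stepping
    x = x + dt * f(x, k)
print(x)                                       # network output = ODE state at time T
```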

Fri, 13 Nov 2020

12:00 - 13:00

Computational Hardness of Hypothesis Testing and Quiet Plantings

Afonso Bandeira
(ETH Zurich)
Abstract

When faced with a data analysis, learning, or statistical inference problem, the amount and quality of data available fundamentally determines whether such tasks can be performed with certain levels of accuracy. With the growing size of datasets, however, it is crucial not only that the underlying statistical task is possible, but also that it is doable by means of efficient algorithms. In this talk we will discuss methods aiming to establish limits on when statistical tasks are possible with computationally efficient methods, or when there is a fundamental “statistical-to-computational gap” in which an inference task is statistically possible but inherently computationally hard. We will focus on Hypothesis Testing and the “Low Degree Method”, and also address hardness of certification via “quiet plantings”. Guiding examples will include Sparse PCA, bounds on the Sherrington-Kirkpatrick Hamiltonian, and lower bounds on Chromatic Numbers of random graphs.

Fri, 06 Nov 2020

12:00 - 13:00

Bridging GANs and Stochastic Analysis

Haoyang Cao
(Alan Turing Institute)
Abstract

Generative adversarial networks (GANs) have enjoyed tremendous success in image generation and processing, and have recently attracted growing interest in other fields of application. In this talk we will start by analyzing the connection between GANs and mean field games (MFGs) as well as optimal transport (OT). We will first show a conceptual connection between GANs and MFGs: MFGs have the structure of GANs, and GANs are MFGs under the Pareto optimality criterion. Interpreting MFGs as GANs will, on the one hand, enable a GANs-based algorithm (MFGANs) to solve MFGs: one neural network (NN) for the backward Hamilton-Jacobi-Bellman (HJB) equation and one NN for the Fokker-Planck (FP) equation, with the two NNs trained in an adversarial way. Viewing GANs as MFGs will, on the other hand, reveal a new and probabilistic aspect of GANs. This new perspective, moreover, will lead to an analytical connection between GANs and OT problems, and sufficient conditions for the minimax games of GANs to be reformulated in the framework of OT. Building on the probabilistic view of GANs, we will then establish the approximation of GAN training via stochastic differential equations and demonstrate the convergence of GAN training via invariant measures of SDEs under proper conditions. This stochastic analysis of GAN training can serve as an analytical tool to study its evolution and stability.

Fri, 30 Oct 2020

12:00 - 13:00

Neural differential equations in machine learning

Patrick Kidger
(Oxford Mathematics)
Abstract

Differential equations and neural networks are two of the most widespread modelling paradigms. I will talk about how to combine the best of both worlds through neural differential equations. These treat differential equations as a learnt component of a differentiable computation graph, and as such integrate tightly with current machine learning practice. Applications are widespread. I will begin with an introduction to the theory of neural ordinary differential equations, which may for example be used to model unknown physics. I will then move on to discussing recent work on neural controlled differential equations, which are state-of-the-art models for (arbitrarily irregular) time series. Next will be some discussion of neural stochastic differential equations: we will see that the mathematics of SDEs is precisely aligned with the machine learning of GANs, and thus NSDEs may be used as generative models. If time allows I will then discuss other recent work, such as how the training of neural differential equations may be sped up by ~40% by tweaking standard numerical solvers to respect the particular nature of the differential equations. This is joint work with Ricky T. Q. Chen, Xuechen Li, James Foster, and James Morrill.
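
A rough sketch of the neural controlled differential equation idea mentioned above, with random untrained weights and a crude Euler discretization: the hidden state evolves as dz = f_theta(z) dX(t), driven by an (irregularly sampled) input path X, so the same model handles arbitrary time stamps. Real implementations interpolate X and use proper ODE solvers; everything below is a placeholder.

```python
# Euler discretization of a neural controlled differential equation dz = f_theta(z) dX.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden = 3, 8
W1 = 0.3 * rng.normal(size=(16, d_hidden))
W2 = 0.3 * rng.normal(size=(d_hidden * d_in, 16))

def f_theta(z):
    """Vector field: maps the hidden state to a (d_hidden x d_in) matrix."""
    h = np.tanh(W1 @ z)
    return (W2 @ h).reshape(d_hidden, d_in)

# Irregularly sampled input time series (time stamps and values are toy data).
ts = np.sort(rng.uniform(0, 1, size=20))
X = np.cumsum(rng.normal(size=(20, d_in)), axis=0)

z = np.zeros(d_hidden)                         # hidden state (normally initialised from X[0])
for k in range(len(ts) - 1):
    z = z + f_theta(z) @ (X[k + 1] - X[k])     # one Euler step driven by the data increments
print(z)                                       # final hidden state, fed to a classifier head
```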

Fri, 16 Oct 2020

12:00 - 13:00

Advances in Topology-Based Graph Classification

Bastian Rieck
(ETH Zurich)
Abstract

Topological data analysis has proven to be an effective tool in machine learning, supporting the analysis of neural networks, but also driving the development of new algorithms that make use of topological features. Graph classification is of particular interest here, since graphs are inherently amenable to a topological description in terms of their connected components and cycles. This talk will briefly summarise recent advances in topology-based graph classification, focussing equally on ‘shallow’ and ‘deep’ approaches. Starting from an intuitive description of persistent homology, we will discuss how to incorporate topological features into the Weisfeiler–Lehman colour refinement scheme, thus obtaining a simple feature-based graph classification algorithm. We will then build a bridge to graph neural networks and demonstrate a topological variant of ‘readout’ functions, which can be learned in an end-to-end fashion. Care has been taken to make the talk accessible to an audience that might not have been exposed to machine learning or topological data analysis.
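
For background, a minimal sketch of the ‘shallow’ baseline referred to above: Weisfeiler–Lehman colour refinement turning a graph into a bag-of-colours feature vector, to which topological features (for example, persistence of a filtration induced by the colours) could then be appended. The hashing scheme, iteration count and toy graphs are arbitrary choices of mine.

```python
# Weisfeiler-Lehman colour refinement features for small graphs.
import numpy as np
from collections import Counter

def wl_features(adjacency, n_iter=3):
    n = len(adjacency)
    colours = [0] * n                                     # uniform initial colouring
    hist = Counter((0, c) for c in colours)
    for it in range(1, n_iter + 1):
        colours = [hash((colours[v],
                         tuple(sorted(colours[u] for u in range(n) if adjacency[v][u]))))
                   for v in range(n)]                     # refine by neighbourhood multiset
        hist.update((it, c) for c in colours)             # keep refinement rounds apart
    return hist                                           # bag of colours = graph feature

path = [[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]]          # path graph P4
star = [[0,1,1,1],[1,0,0,0],[1,0,0,0],[1,0,0,0]]          # star graph S3
print(wl_features(path))
print(wl_features(star))                                  # the two histograms differ
```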