Forthcoming events in this series


Fri, 10 Mar 2023

15:00 - 16:00
Lecture Room 4

Mapper-type algorithms for complex data and relations

Radmila Sazdanovic
Abstract

Mapper and Ball Mapper are Topological Data Analysis tools used for exploring high-dimensional point clouds and visualizing scalar-valued functions on those point clouds. Inspired by open questions in knot theory, new features are added to Ball Mapper that enable encoding of the structure, internal relations and symmetries of the point cloud. Moreover, the strengths of Mapper and Ball Mapper constructions are combined to create a tool for comparing high-dimensional data descriptors of a single dataset. This new hybrid algorithm, Mapper on Ball Mapper, is applicable to high-dimensional lens functions. As a proof of concept we include applications to knot and game theory, as well as material science and cancer research.

Fri, 03 Mar 2023

16:00 - 17:00
Lecture Room 6

Topological Optimization with Big Steps

Dmitry Morozov
Abstract

Using persistent homology to guide optimization has emerged as a novel application of topological data analysis. Existing methods treat persistence calculation as a black box and backpropagate gradients only onto the simplices involved in particular pairs. We show how the cycles and chains used in the persistence calculation can be used to prescribe gradients to larger subsets of the domain. In particular, we show that in a special case, which serves as a building block for general losses, the problem can be solved exactly in linear time. We present empirical experiments that show the practical benefits of our algorithm: the number of steps required for the optimization is reduced by an order of magnitude. (Joint work with Arnur Nigmetov.)

Fri, 24 Feb 2023

15:00 - 16:00
Lecture Room 4

Analysing the shape of 3-periodic scalar fields for diffusion modelling

Senja Barthel
Abstract

Simulating diffusion computationally makes it possible to predict the diffusivity of materials, to understand diffusion mechanisms, and to tailor-make materials such as solid-state electrolytes with desired properties, with the aim of developing new batteries. By studying the geometry and topology of 3-periodic scalar fields (e.g. the potential of ions in the electrolyte), we develop a cost-efficient multi-scale model for diffusion in crystalline materials. This project is a typical example of a collaboration in the overlap of topology and materials science that started as a persistent homology project and turned into something else.

Fri, 17 Feb 2023

15:00 - 16:00
Lecture Room 4

Möbius Inversions and Persistent Homology

Amit Patel
Abstract

There are several ways of defining the persistence diagram, but the definition using the Möbius inversion formula (for posets) offers the greatest amount of flexibility. There are now many variations of the so-called Generalized Persistence Diagrams by many people. In this talk, I will focus on the approach I am developing. I will cover the state of the art and where I see this work going.

Fri, 27 Jan 2023
15:00
L2

TDA Centre Meeting

Various Speakers
(Mathematical Institute (University of Oxford))
Fri, 20 Jan 2023
15:00
L4

Applied Topology TBC

Michael Robinson
(American University)
Further Information

I am an applied mathematician working as an associate professor at American University. I am interested in signal processing, dynamics, and applications of topology.

Fri, 02 Dec 2022

15:00 - 16:00
L6

On the Discrete Geometric Principles of Machine Learning and Statistical Inference

Jesús A. De Loera
(UC Davis)
Further Information

You can find out more about Professor De Loera here: https://www.math.ucdavis.edu/~deloera/ 

Abstract

In this talk I explain the fertile relationship between the foundations of inference and learning and combinatorial geometry.

My presentation contains several powerful examples where famous theorems in discrete geometry answered natural questions from machine learning and statistical inference.

In this tasting tour I will include the problem of deciding the existence of the maximum likelihood estimator in multiclass logistic regression, the variability of behavior of k-means algorithms with distinct random initializations and the shapes of the clusters, and the estimation of the number of samples in chance-constrained optimization models. These obviously only scratch the surface of what one could do with extra free time. Along the way we will see fascinating connections to the coupon collector problem, topological data analysis, measures of separability of data, and to the computation of Tukey centerpoints of data clouds (a high-dimensional generalization of the median). All new theorems are joint work with subsets of the following wonderful folks: T. Hogan, D. Oliveros, E. Jaramillo-Rodriguez, and A. Torres-Hernandez.

Two relevant papers, published or to appear, are:

https://arxiv.org/abs/1907.09698

https://arxiv.org/abs/2205.05743

Fri, 25 Nov 2022

15:00 - 16:00
L5

Signal processing on cell complexes using discrete Morse theory

Celia Hacker
(EPFL)
Further Information

Celia has been a PhD student under the supervision of Kathryn Hess since 2018.

Abstract

At the intersection of Topological Data Analysis and machine learning, the field of cellular signal processing has advanced rapidly in recent years. In this context, each signal on the cells of a complex is processed using the combinatorial Laplacian and the resulting Hodge decomposition. Meanwhile, discrete Morse theory has been widely used to speed up computations by reducing the size of complexes while preserving their global topological properties. In this talk, we introduce an approach to signal compression and reconstruction on complexes that leverages the tools of discrete Morse theory. The main goal is to reduce and reconstruct a cell complex together with a set of signals on its cells while preserving their global topological structure as much as possible. This is joint work with Stefania Ebli and Kelly Maggs.

Fri, 18 Nov 2022

15:00 - 16:00
L5

Tensor-based frameworks for cancer genomics

Neriman Tokcan
(MIT & Harvard)
Further Information

(taken from https://nerimantokcan.com/)

Neriman Tokcan's research focuses on formulating novel, mathematically sound theoretical frameworks to perform analysis of multi-modal, multi-dimensional data while preserving the integrity of their structure. Her work concerns the generalization of matrix-based compression, noise elimination, and dimension reduction methods to higher dimensions. Her background is at the intersection of algebraic geometry, multi-linear algebra, combinatorics, and representation theory. She explores applications in bioinformatics and cancer genomics.

Currently, Neriman is working on the formulation of novel, mathematically sound tensor-based frameworks, and the development of computational tools to model tumor microenvironments.

Neriman will join the University of Massachusetts Boston as a Tenure-Track Assistant Professor of Applied Mathematics in January 2023.

Abstract

The tumor microenvironment (TME) is a complex milieu around the tumor, whereby cancer cells interact with stromal, immune, vascular, and extracellular components. The TME is being increasingly recognized as a key determinant of tumor growth, disease progression, and response to therapies. We build a generalizable and robust tensor-based framework capable of integrating dissociated single-cell and spatially resolved RNA-seq data for a comprehensive analysis of the TME. Tensors are a generalization of matrices to higher dimensions. Tensor methods are known to be able to successfully incorporate data from multiple sources and perform a joint analysis of heterogeneous high-dimensional data sets. The methodologies developed as part of this effort will advance our understanding of the TME in multiple directions. These include cellular heterogeneity within the TME, crosstalks between cells, and tumor-intrinsic pathways stimulating tumor growth and immune evasion.

Fri, 11 Nov 2022

12:00 - 15:45
L2

Centre for Topological Data Analysis Centre Meeting

Adam Brown, Heather Harrington, Živa Urbančič, David Beers.
(University of Oxford, Mathematical Institute)
Further Information

Details of speakers and schedule will be posted here nearer the time. 

Abstract

Here is the program.

Fri, 04 Nov 2022

15:00 - 16:00
L5

Dynamics of neural circuits at different scales

Jānis Lazovskis
(RTU Riga Business School)
Further Information

Jānis Lazovskis is an Assistant Professor at RTU Riga Business School in Riga, Latvia, working in algebraic topology and topological data analysis, in particular dynamic data. His research focuses on the intersection of topology and neuroscience, simplifying and classifying in silico activity with graph theoretic and topological tools. Previously Jānis worked as a postdoc in Ran Levi's group at Aberdeen, and completed his PhD under Ben Antieau at the University of Illinois at Chicago. As an instructor and administrator of undergraduate mathematics courses, Jānis pushes for more inclusion and equity through better teaching methods and modified assessments.

Abstract

Models of animal brains are increasingly common and mapped in increasing detail. To simplify analysis of their function, we consider subregions and show that they perform well as classifiers of overall activity, with only a fraction of the neurons. The uniqueness of such "reliable" regions seems to be related to the types of connections that pairs of neurons form in them. By focusing on topologically significant structures and reciprocally connected neurons we find even stronger classification results. This is ongoing work across several institutions, including EPFL, the Blue Brain Project, and the University of Aberdeen.

Fri, 28 Oct 2022

15:00 - 16:00
L5

Topological Data Analytic Frameworks for Discovering Biophysical Signatures in 3D Shapes and Images

Lorin Crawford
(Brown University)
Further Information

Lorin Crawford is the RGSS Assistant Professor of Biostatistics at Brown University. He is affiliated with the Center for Statistical Sciences, Center for Computational Molecular Biology, and the Robert J. and Nancy D. Carney Institute for Brain Science.

Abstract
Fri, 21 Oct 2022

15:00 - 16:00
L5

Kan Extensions and Kan Ensembles in Machine Learning

Dan Shiebler
(Abnormal Security)
Further Information

Right now Dan works as the Head of Machine Learning at Abnormal Security. Previously, he led the Web Ads Machine Learning team at Twitter. Before that he worked as a Staff ML Engineer at Twitter Cortex and a Senior Data Scientist at TrueMotion.

His PhD research at the University of Oxford focused on applications of Category Theory to Machine Learning (advised by Jeremy Gibbons and Cezar Ionescu). Before that he worked as a Computer Vision Researcher at the Serre Lab.

You can find out more about Dan here: https://danshiebler.com/ 

Abstract

A common problem in data science is "use this function defined over this small set to generate predictions over that larger set." Extrapolation, interpolation, statistical inference and forecasting all reduce to this problem. The Kan extension is a powerful tool in category theory that generalizes this notion. In this work we explore several applications of Kan extensions to data science. We begin by deriving simple classification and clustering algorithms as Kan extensions and experimenting with these algorithms on real data. Next, we build more complex and resilient algorithms from these simple parts.

Fri, 14 Oct 2022

15:00 - 16:00
L5

Applied Topology for Discrete Structures

Emilie Purvine
(Pacific Northwest National Laboratory)
Further Information

(From PNNL website)

Emilie's academic background is in pure mathematics, with a BS from University of Wisconsin - Madison and a PhD from Rutgers University. Her research since joining PNNL in 2011 has focused on applications of combinatorics and computational topology together with theoretical advances needed to support the applications. Over her time at PNNL, Purvine has served as both a primary investigator and technical staff member on several projects in applications ranging from computational chemistry and biology to cybersecurity and power grid modeling. She has authored over 40 technical publications and is currently an associate editor for the Notices of the American Mathematical Society. Purvine also coordinates PNNL’s Postgraduate Organization, which plans career development seminars and an annual research symposium, and promotes networking and mentorship for PNNL’s post bachelors, post masters, and post doctorate research associates.

Abstract

Discrete structures have a long history of use in applied mathematics. Graphs and hypergraphs provide models of social networks, biological systems, academic collaborations, and much more. Network science, and more recently hypernetwork science, have been used to great effect in analyzing these types of discrete structures. Separately, the field of applied topology has gathered many successes through the development of persistent homology, mapper, sheaves, and other concepts. Recent work by our group has focused on the convergence of these two areas, developing and applying topological concepts to study discrete structures that model real data.

This talk will survey our body of work in this area showing our work in both the theoretical and applied spaces. Theory topics will include an introduction to hypernetwork science and its relation to traditional network science, topological interpretations of graphs and hypergraphs, and dynamics of topology and network structures. I will show examples of how we are applying each of these concepts to real data sets.

Fri, 10 Jun 2022
15:00
L3

Directed networks through simplicial paths and Hochschild homology

Henri Riihimäki
(KTH Royal Institute of Technology)
Abstract

Directed graphs are a model for various phenomena in the sciences. In topological data analysis in particular, the advent of applying topological tools to networks of brain neurons has spawned interest in constructing topological spaces out of digraphs, developing computational tools for obtaining topological information, and using these to understand networks. At the end of the day, (homological) computations of the spaces reveal something about the geometric realisation, thereby losing the directionality information.

However, digraphs can also be associated with path algebras. We can now consider applying Hochschild homology to extract information, hopefully obtaining something more refined in terms of the combinatorics of the directed edges and paths in the digraph. Unfortunately, Hochschild homology tends to vanish beyond degree 1. We can overcome this by considering different higher paths of simplices, and thus introduce Hochschild homology of digraphs in higher degrees. Moreover, this procedure gives an implementable persistence pipeline for network analysis. This is joint work with Luigi Caputi.

Fri, 03 Jun 2022
15:00
L3

Projected barcodes: a new class of invariants and distances for multi-parameter persistence modules

Nicolas Berkouk
(École Polytechnique Fédérale de Lausanne (EPFL))
Abstract

In this talk, we will present a new class of invariants of multi-parameter persistence modules: projected barcodes. Relying on Grothendieck's six operations for sheaves, projected barcodes are defined as derived pushforwards of persistence modules (which can be seen as sheaves on a vector space in a precise sense) onto $\mathbb{R}$. We will prove that the well-known fibered barcode is a particular instance of projected barcodes. Moreover, our construction is able to distinguish persistence modules that have the same fibered barcodes but are not isomorphic. We will present a systematic study of the stability of projected barcodes. Given a subset $F$ of the 1-Lipschitz functions, this leads us to define a new class of well-behaved distances between persistence modules, the $F$-Integral Sheaf Metrics ($F$-ISM), as the supremum over $p$ in $F$ of the bottleneck distance between the projected barcodes by $p$ of two persistence modules.
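
The definition of the $F$-ISM given in words above can be written as a formula. The notation here is an assumption made for illustration and is not taken from the abstract: $\mathcal{B}_p(M)$ stands for the projected barcode of a persistence module $M$ by $p$, and $d_B$ for the bottleneck distance.

$$
d_F(M, N) \;=\; \sup_{p \in F} \, d_B\big(\mathcal{B}_p(M),\, \mathcal{B}_p(N)\big), \qquad F \subseteq \{\, p : p \text{ is 1-Lipschitz} \,\}.
$$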

In the case where $M$ is the collection in all degrees of the sublevel-sets persistence modules of a function $f : X \to \mathbb{R}^n$, we prove that the projected barcode of $M$ by a linear map $p : \mathbb{R}^n \to \mathbb{R}$ is nothing but the collection of sublevel-sets barcodes of the post-composition of $f$ by $p$. In particular, it can be computed using existing software, without having to compute $M$ in its entirety. We also provide an explicit formula for the gradient with respect to $p$ of the bottleneck distance between projected barcodes, which makes it possible to approximate the linear ISM with a gradient ascent scheme. This is joint work with François Petit.

Fri, 20 May 2022

15:00 - 16:00
L3

Approximating Persistent Homology for Large Datasets

Anthea Monod
(Imperial College London)
Abstract

Persistent homology is an important methodology from topological data analysis which adapts theory from algebraic topology to data settings and has been successfully implemented in many applications. It produces a statistical summary in the form of a persistence diagram, which captures the shape and size of the data. Despite its widespread use, persistent homology is simply impossible to implement when a dataset is very large. In this talk, I will address the problem of finding a representative persistence diagram for prohibitively large datasets. We adapt the classical statistical method of bootstrapping, namely, drawing and studying smaller multiple subsamples from the large dataset. We show that the mean of the persistence diagrams of subsamples—taken as a mean persistence measure computed from the subsamples—is a valid approximation of the true persistent homology of the larger dataset. We give the rate of convergence of the mean persistence diagram to the true persistence diagram in terms of the number of subsamples and size of each subsample. Given the complex algebraic and geometric nature of persistent homology, we adapt the convexity and stability properties in the space of persistence diagrams together with random set theory to achieve our theoretical results for the general setting of point cloud data. We demonstrate our approach on simulated and real data, including an application of shape clustering on complex large-scale point cloud data.

This is joint work with Yueqi Cao (Imperial College London).

Fri, 13 May 2022

15:00 - 16:00
L2

Non-Euclidean Data Analysis (and a lot of questions)

John Aston
(University of Cambridge)
Abstract

The statistical analysis of data which lies in a non-Euclidean space has become increasingly common over the last decade, starting from the point of view of shape analysis, but also being driven by a number of novel application areas. However, while there are a number of interesting avenues this analysis has taken, particularly around positive definite matrix data and data which lies in function spaces, it has increasingly raised more questions than answers. In this talk, I'll introduce some non-Euclidean data from applications in brain imaging and in linguistics, but spend considerable time asking questions, where I hope the interaction of statistics and topological data analysis (understood broadly) could potentially start to bring understanding into the applications themselves.

Fri, 06 May 2022

15:00 - 16:00
L4

Applied Topology TBC

Bernadette Stolz
(University of Oxford, Mathematical Institute)
Fri, 29 Apr 2022

15:00 - 16:00
L4

Signed barcodes for multiparameter persistence

Magnus Botnan
(Free University of Amsterdam)
Abstract

Moving from persistent homology in one parameter to multiparameter persistence comes at a significant increase in complexity. In particular, the notion of a barcode does not generalize straightforwardly. However, in this talk, I will show how it is possible to assign a unique barcode to a multiparameter persistence module if one is willing to take Z-linear combinations of intervals. The theoretical discussion will be complemented by numerical experiments. This is joint work with Steffen Oppermann and Steve Oudot.

Fri, 04 Mar 2022

15:00 - 16:00
L6

Open questions on protein topology in its natural environment

Christopher Prior
(Durham University)
Abstract

Small angle x-ray scattering is one of the most flexible and readily available experimental methods for obtaining information on the structure of proteins in solution. With the advent of powerful predictive methods such as the AlphaFold and RoseTTAFold algorithms, this information has become increasingly in demand, owing to the need to characterise the more flexible and varying components of proteins which resist characterisation by these and more standard experimental techniques. To deal with structures about which little is known, a parsimonious method of representing the tertiary fold of a protein backbone as a discrete curve has been developed. It represents the fundamental local Ramachandran constraints through a pair of parameters and is able to generate millions of potentially realistic protein geometries in a short space of time. The data obtained from these methods provide a treasure trove of information on the potential range of topological structures available to proteins, which is much more constrained than that available to self-avoiding walks, but still far more complex than currently understood from existing data. I will introduce this method and its considerations, then attempt to pose some questions I think topological data analysis might help answer. Along the way I will ask why roadies might also help give us some insight…

Fri, 25 Feb 2022

15:00 - 16:00
L6

Homotopy, Homology, and Persistent Homology using Cech’s Closure Spaces

Peter Bubenik
(University of Florida)
Abstract

We use Cech closure spaces, also known as pretopological spaces, to develop a uniform framework that encompasses the discrete homology of metric spaces, the singular homology of topological spaces, and the homology of (directed) clique complexes, along with their respective homotopy theories. We obtain nine homology and six homotopy theories of closure spaces. We show how metric spaces and more general structures such as weighted directed graphs produce filtered closure spaces. For filtered closure spaces, our homology theories produce persistence modules. We extend the definition of Gromov-Hausdorff distance to filtered closure spaces and use it to prove that our persistence modules and their persistence diagrams are stable. We also extend the definitions of Vietoris-Rips and Cech complexes to closure spaces and prove that their persistent homology is stable.

This is joint work with Nikola Milicevic.

Fri, 11 Feb 2022

15:00 - 16:00
L2

Topology-Based Graph Learning

Bastian Rieck
(Helmholtz Zentrum München)
Abstract

Topological data analysis is starting to establish itself as a powerful and effective framework in machine learning, supporting the analysis of neural networks, but also driving the development of novel algorithms that incorporate topological characteristics. As a problem class, graph representation learning is of particular interest here, since graphs are inherently amenable to a topological description in terms of their connected components and cycles. This talk will provide an overview of how to address graph learning tasks using machine learning techniques, with a specific focus on how to make such techniques 'topology-aware.' We will discuss how to learn filtrations for graphs and how to incorporate topological information into modern graph neural networks, resulting in provably more expressive algorithms. This talk aims to be accessible to an audience of TDA enthusiasts; prior knowledge of machine learning is helpful but not required.

Fri, 04 Feb 2022

11:00 - 12:00
L6

Computing the Extended Persistent Homology Transform of binary images

Katharine Turner
(Australian National University)
Further Information

PLEASE NOTE this seminar will be at 11am instead of 3pm.

Abstract

The Persistent Homology Transform and the Euler Characteristic Transform are topological analogs of the Radon transform that can be used in statistical shape analysis. In this talk I will consider an interesting variant called the Extended Persistent Homology Transform (XPHT) which replaces the normal persistent homology with extended persistent homology. We are particularly interested in the application of the XPHT to binary images. This paper outlines an algorithm for efficient calculation of the XPHT exploiting relationships between the PHT of the boundary curves and the XPHT of the foreground.

Fri, 28 Jan 2022

15:00 - 16:00
L6

Topological Tools for Signal Processing

Sarah Tymochko
(Michigan State University)
Abstract

Topological data analysis (TDA) is a field with tools to quantify the shape of data in a manner that is concise and robust using concepts from algebraic topology. Persistent homology, one of the most popular tools in TDA, has proven useful in applications to time series data, detecting shape that changes over time and quantifying features like periodicity. In this talk, I will present two applications using tools from TDA to study time series data: the first uses zigzag persistence, a generalization of persistent homology, to study bifurcations in dynamical systems, and the second uses the shape of weighted, directed networks to distinguish periodic and chaotic behavior in time series data.

Fri, 21 Jan 2022

15:00 - 16:00
L6

A Multivariate CLT for Dissociated Sums with Applications to Random Complexes

Tadas Temčinas
(Mathematical Institute)
Abstract

Acyclic partial matchings on simplicial complexes play an important role in topological data analysis by facilitating efficient computation of (persistent) homology groups. Here we describe probabilistic properties of critical simplex counts for such matchings on clique complexes of Bernoulli random graphs. In order to accomplish this goal, we generalise the notion of a dissociated sum to a multivariate setting and prove an abstract multivariate central limit theorem using Stein's method. As a consequence of this general result, we are able to extract central limit theorems not only for critical simplex counts, but also for generalised U-statistics (and hence for clique counts in Bernoulli random graphs) as well as simplex counts in the link of a fixed simplex in a random clique complex.

Fri, 10 Dec 2021

15:00 - 16:00
Virtual

A topological approach to signatures

Darrick Lee
(EPFL)
Abstract

The path signature is a characterization of paths that originated in Chen's iterated integral cochain model for path spaces and loop spaces. More recently, it has been used to form the foundations of rough paths in stochastic analysis, and provides an effective feature map for sequential data in machine learning. In this talk, we return to the topological foundations in Chen's construction to develop generalizations of the signature.

Fri, 26 Nov 2021

15:00 - 16:00
Virtual

Morse inequalities for the Koszul complex of multi-persistence

Claudia Landi
(University of Modena and Reggio Emilia)
Abstract

In this talk, I'll present inequalities bounding the number of critical cells in a filtered cell complex on the one hand, and the entries of the Betti tables of the multi-parameter persistence modules of such filtrations on the other hand. Using the Mayer-Vietoris spectral sequence we first obtain strong and weak Morse inequalities involving the above quantities, and then we improve the weak inequalities achieving a sharp lower bound for the number of critical cells. Furthermore, we prove a sharp upper bound for the minimal number of critical cells, expressed again in terms of the entries of Betti tables. This is joint work with Andrea Guidolin (KTH, Stockholm). The full paper is posted online as arXiv:2108.11427.

Fri, 05 Nov 2021

15:00 - 16:00
Virtual

Why should one care about metrics on (multi) persistent modules?

Wojciech Chacholski
(KTH)
Abstract

What do we use metrics on persistent modules for? Is it only to assure stability of some constructions?

In my talk I will describe why I care about such metrics, show how to construct a rich space of them and illustrate how to use them for analysis.

Fri, 29 Oct 2021

15:00 - 16:00
Virtual

Modeling shapes and fields: a sheaf theoretic perspective

Sayan Mukherjee
(Duke University)
Abstract

We will consider modeling shapes and fields via topological and lifted-topological transforms. 

Specifically, we show how the Euler Characteristic Transform and the Lifted Euler Characteristic Transform can be used in practice for statistical analysis of shape and field data. The Lifted Euler Characteristic is an alternative to the Euler calculus developed by Ghrist and Baryshnikov for real-valued functions. We also state a moduli space of shapes for which we can provide a complexity metric for the shapes. We also provide a sheaf theoretic construction of shape space that does not require diffeomorphisms or correspondence. A direct result of this sheaf theoretic construction is that in three dimensions for meshes, 0-dimensional homology is enough to characterize the shape.

Fri, 22 Oct 2021

15:00 - 16:00
Virtual

Combinatorial Laplacians in data analysis: applications in genomics

Pablo Cámara
(University of Pennsylvania)
Further Information

Pablo G. Cámara is an Assistant Professor of Genetics at the University of Pennsylvania and a faculty member of the Penn Institute for Biomedical Informatics. He received a Ph.D. in Theoretical Physics in 2006 from Universidad Autónoma de Madrid. He performed research in string theory for several years, with postdoctoral appointments at Ecole Polytechnique, the European Organization for Nuclear Research (CERN), and University of Barcelona. Fascinated by the extremely interesting and fundamental open questions in biology, in 2014 he shifted his research focus into problems in quantitative biology, and joined the groups of Dr. Rabadan, at Columbia University, and Dr. Levine, at the Institute for Advanced Study (Princeton). Building upon techniques from applied topology and statistics, he has devised novel approaches to the inference of ancestral recombination, human recombination mapping, the study of cancer heterogeneity, and the analysis of single-cell RNA-sequencing data from dynamic and heterogeneous cellular populations.

Abstract

One of the prevailing paradigms in data analysis involves comparing groups of samples to statistically infer features that discriminate them. However, many modern applications do not fit well into this paradigm because samples cannot be naturally arranged into discrete groups. In such instances, graph techniques can be used to rank features according to their degree of consistency with an underlying metric structure without the need to cluster the samples. Here, we extend graph methods for feature selection to abstract simplicial complexes and present a general framework for clustering-independent analysis. Combinatorial Laplacian scores take into account the topology spanned by the data and reduce to the ordinary Laplacian score when restricted to graphs. We show the utility of this framework with several applications to the analysis of gene expression and multi-modal cancer data. Our results provide a unifying perspective on topological data analysis and manifold learning approaches to the analysis of point clouds.
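
The ordinary graph Laplacian score, to which the combinatorial scores are said to reduce on graphs, can be illustrated with a small example. This is a minimal sketch of the standard graph Laplacian score under stated assumptions, not the speaker's framework: the function name `laplacian_score`, the four-node path graph `W` and the two toy features are hypothetical choices made for illustration. A lower score means the feature varies more smoothly along the edges of the graph.

```python
def laplacian_score(weights, feature):
    """Ordinary Laplacian score of one feature on a weighted graph.

    weights: symmetric adjacency matrix as a list of lists (hypothetical input format).
    feature: one value per node.
    Assumes the feature is not constant, otherwise the denominator is zero.
    """
    n = len(feature)
    degrees = [sum(row) for row in weights]
    total_degree = sum(degrees)

    # Remove the degree-weighted mean so that constant offsets do not matter.
    mean = sum(d * f for d, f in zip(degrees, feature)) / total_degree
    centered = [f - mean for f in feature]

    # Numerator: f^T L f, written as half the weighted sum of squared differences.
    numerator = 0.5 * sum(
        weights[i][j] * (centered[i] - centered[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    # Denominator: f^T D f, the degree-weighted variance of the feature.
    denominator = sum(d * c * c for d, c in zip(degrees, centered))
    return numerator / denominator


# Toy data: a path graph 0 - 1 - 2 - 3 with unit edge weights.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [0.0, 0.0, 1.0, 1.0]       # changes once along the path
alternating = [0.0, 1.0, 0.0, 1.0]  # changes on every edge

smooth_score = laplacian_score(W, smooth)
alternating_score = laplacian_score(W, alternating)
```

On this toy graph the smooth feature receives the lower score, so it would be ranked as more consistent with the graph structure, and no clustering of the nodes is needed to obtain the ranking.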

Fri, 15 Oct 2021

15:00 - 16:00

Exemplars of Sheaf Theory in TDA

Justin Curry
(University at Albany)
Abstract

In this talk I will present four case studies of sheaves and cosheaves in topological data analysis. The first two are examples of (co)sheaves in the small:

(1) level set persistence, and its efficacious computation via discrete Morse theory, and

(2) decorated merge trees and Reeb graphs, enriched topological invariants that have enhanced classification power over traditional TDA methods.

The second set of examples is focused on (co)sheaves in the large:

(3) understanding the space of merge trees as a stratified map to the space of barcodes and

(4) the development of a new "sheaf of sheaves" that organizes the persistent homology transform over different shapes.

Fri, 04 Jun 2021

15:00 - 16:00
Virtual

Topological and geometric analysis of graphs

Yusu Wang
(University of California San Diego)
Abstract

In recent years, topological and geometric data analysis (TGDA) has emerged as a new and promising field for processing, analyzing and understanding complex data. Indeed, geometry and topology form natural platforms for data analysis, with geometry describing the "shape" behind data, and topology characterizing and summarizing both the domain from which data are sampled and the functions and maps associated with them. In this talk, I will show how topological (and geometric) ideas can be used to analyze graph data, which occurs ubiquitously across science and engineering. Graphs could be geometric in nature, such as road networks in GIS, or relational and abstract. I will particularly focus on the reconstruction of hidden geometric graphs from noisy data, as well as graph matching and classification. I will discuss the motivating applications, algorithm development, and theoretical guarantees for these methods. Through these topics, I aim to illustrate the important role that topological and geometric ideas can play in data analysis.

Fri, 28 May 2021

15:00 - 16:00
Virtual

The applications and algorithms of correspondence modules

Haibin Hang
(University of Delaware)
Abstract

In this work we systematically introduce relations to topological data analysis (TDA) in the categories of sets, simplicial complexes and vector spaces, in order to characterize and study general dynamical behaviors in a consistent way. The proposed framework not only offers new insights into classical TDA methodologies, but also motivates new approaches to interesting applications of TDA in dynamical metric spaces, dynamical coverings, etc. The associated algorithm, which produces barcode invariants, and relations in more general categories will also be discussed.

Fri, 21 May 2021

15:00 - 16:00
Virtual

Persistent Laplacians: properties, algorithms and implications

Zhengchao Wan
(Ohio State University)
Abstract

In this work we present a thorough study of the theoretical properties of the persistent Laplacian and devise efficient algorithms for computing it. The persistent Laplacian, recently introduced by Wang, Nguyen, and Wei, extends the standard combinatorial Laplacian to the setting of simplicial pairs, that is, pairs of simplicial complexes related by an inclusion.

In analogy with the non-persistent case, we establish that the nullity of the q-th persistent Laplacian equals the q-th persistent Betti number of any given simplicial pair, which provides an interesting connection between spectral graph theory and TDA.

We further exhibit a novel relationship between the persistent Laplacian and the notion of Schur complement of a matrix. This relation permits us to uncover a link with the notion of effective resistance from network circuit theory and leads to a persistent version of the Cheeger inequality.

This relationship also leads to a novel and fundamentally different algorithm for computing the persistent Betti number for a pair of simplicial complexes, which can be significantly more efficient than standard algorithms.
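
The nullity statement and the Schur complement relation can be checked by hand in the simplest case q = 0. The sketch below rests on an assumption about how the relation specializes: that the 0-th persistent Laplacian of a pair K ⊂ L is the Schur complement of the graph Laplacian of L onto the vertices of K, so that only the vertex set of K enters. All function names and the toy pair are hypothetical, and exact rational arithmetic is used so that the rank computation is reliable.

```python
from fractions import Fraction


def graph_laplacian(n, edges):
    """Graph Laplacian (degree matrix minus adjacency matrix) on vertices 0..n-1."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    return lap


def schur_onto(matrix, keep):
    """Schur complement onto the indices in `keep`, eliminating one index at a time."""
    m = [row[:] for row in matrix]
    alive = list(range(len(m)))
    for v in [i for i in alive if i not in keep]:
        alive.remove(v)
        pivot = m[v][v]
        if pivot != 0:  # a zero pivot is an isolated vertex: nothing to redistribute
            for i in alive:
                for j in alive:
                    m[i][j] -= m[i][v] * m[v][j] / pivot
    return [[m[i][j] for j in keep] for i in keep]


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def persistent_betti0(n, edges, keep):
    """Number of connected components of the larger graph that meet `keep`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in keep})


# Hypothetical pair: K has vertices 0, 1, 2; L adds vertex 3 and the path 0-3-1.
big_edges = [(0, 3), (3, 1)]
sub_vertices = [0, 1, 2]
persistent_laplacian = schur_onto(graph_laplacian(4, big_edges), sub_vertices)
nullity = len(sub_vertices) - rank(persistent_laplacian)
betti = persistent_betti0(4, big_edges, sub_vertices)
```

In this toy pair, eliminating vertex 3 leaves a single edge of weight 1/2 between vertices 0 and 1, which matches the conductance of two unit resistors in series. The nullity of the resulting matrix agrees with the number of components of L that contain a vertex of K.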

Fri, 07 May 2021

15:00 - 16:00
Virtual

Investigating Collective Behaviour and Phase Transitions in Active Matter using TDA

Dhananjay Bhaskar
(Brown University)
Abstract

Active matter systems, ranging from liquid crystals to populations of cells and animals, exhibit complex collective behavior characterized by pattern formation and dynamic phase transitions. However, quantitative analysis of these systems is challenging, especially for heterogeneous populations of varying sizes, and typically requires expertise in formulating problem-specific order parameters. I will describe an alternative approach, using a combination of topological data analysis and machine learning, to investigate emergent behaviors in self-organizing populations of interacting discrete agents.

Fri, 30 Apr 2021

15:00 - 16:00
Virtual

Sketching Persistence Diagrams

Don Sheehy
(North Carolina State University)
Further Information

Don Sheehy is an Associate Professor of Computer Science at North Carolina State University. He received his B.S.E. from Princeton University and his Ph.D. in Computer Science from Carnegie Mellon University. He spent two years as a postdoc at Inria Saclay in France. His research is in algorithms and data structures in computational geometry and topological data analysis.

Abstract

Given a persistence diagram with n points, we give an algorithm that produces a sequence of n persistence diagrams converging in bottleneck distance to the input diagram, the ith of which has i distinct (weighted) points and is a 2-approximation to the closest persistence diagram with that many distinct points. For each approximation, we precompute the optimal matching between the ith and the (i+1)st. Perhaps surprisingly, the entire sequence of diagrams as well as the sequence of matchings can be represented in O(n) space. The main approach is to use a variation of the greedy permutation of the persistence diagram to give good Hausdorff approximations and assign weights to these subsets. We give a new algorithm to efficiently compute this permutation, despite the high implicit dimension of points in a persistence diagram due to the effect of the diagonal. The sketches are also structured to permit fast (linear time) approximations to the Hausdorff distance between diagrams -- a lower bound on the bottleneck distance. For approximating the bottleneck distance, sketches can also be used to compute a linear-size neighborhood graph directly, obviating the need for geometric data structures used in state-of-the-art methods for bottleneck computation.
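
As background for the greedy permutation mentioned above, here is a minimal sketch of the plain farthest-point ordering on a finite set of diagram points under the L-infinity metric. It is not the variation described in the abstract: it ignores the diagonal, assigns no weights, and runs in quadratic time. The names `chebyshev` and `greedy_permutation` and the sample points are hypothetical.

```python
def chebyshev(p, q):
    """L-infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def greedy_permutation(points, dist):
    """Farthest-point ordering of a non-empty list, starting from points[0].

    Returns the visiting order and, for each visited point, its distance to
    the points chosen before it (infinite for the first point).
    """
    order, radii = [0], [float("inf")]
    # gap[i] = distance from points[i] to the nearest chosen point so far.
    gap = [dist(points[0], p) for p in points]
    remaining = set(range(1, len(points)))
    while remaining:
        # Pick the unchosen point farthest from the chosen ones (ties: lowest index).
        nxt = max(sorted(remaining), key=lambda i: gap[i])
        remaining.remove(nxt)
        order.append(nxt)
        radii.append(gap[nxt])
        gap = [min(g, dist(points[nxt], p)) for g, p in zip(gap, points)]
    return order, radii


# Hypothetical (birth, death) pairs of a small persistence diagram.
diagram = [(0, 10), (1, 11), (50, 90), (20, 30)]
order, radii = greedy_permutation(diagram, chebyshev)
```

The first i points of the ordering lie within Hausdorff distance `radii[i]` of the full point set, which is the sense in which prefixes of a greedy permutation give good Hausdorff approximations.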

Fri, 12 Mar 2021

15:00 - 16:00
Virtual

Chain complex reduction via fast digraph traversal

Leon Lampret
(Queen Mary University of London)
Abstract

Reducing a chain complex (whilst preserving its homotopy type) using algebraic Morse theory (AMT; [1, 2, 3]) gives the same end result as Gaussian elimination, but AMT does it only on certain rows/columns and with several pivots (in all matrices simultaneously). Crucially, instead of doing costly row/column operations on a sparse matrix, it computes traversals of a bipartite digraph. This significantly reduces the running time and memory load (smaller fill-in and coefficient growth of the matrices). However, computing with AMT requires the construction of a valid set of pivots (called a Morse matching).

In [4], we discover a family of Morse matchings on any chain complex of free modules of finite rank. We show that every acyclic matching is a subset of some member of our family, so all maximal Morse matchings are of this type.

Both the input and output of AMT are chain complexes, so the procedure can be used iteratively. When working over a field or a local PID, this process ends in a chain complex with zero matrices, which produces homology. However, even over more general rings, the process often reveals homology, or at least reduces the complex so much that other algorithms can finish the job. Moreover, it also returns homotopy equivalences to the reduced complexes, which reveal the generators of homology and the induced maps $H_{*}(\varphi)$. 

We design a new algorithm for reducing a chain complex and implement it in Mathematica. In our tests it outperforms other computer algebra systems (CASs). As a special case, given a sparse matrix over any field, the algorithm offers a new way of computing the rank and a sparse basis of the kernel (or null space), cokernel (or quotient space, or complementary subspace), image, preimage, sum and intersection subspace. It outperforms built-in algorithms in other CASs.
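
For the field case mentioned above, the Gaussian elimination baseline that AMT is compared against can be sketched in a few lines: over a field, the q-th Betti number is the rank of C_q minus the ranks of the two adjacent boundary matrices. This is not the AMT algorithm of [4]. It is a minimal dense-matrix illustration over the rationals, and the function names and the example complex (a hollow triangle) are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def betti_numbers(dims, boundaries):
    """Betti numbers of a chain complex over the rationals.

    dims[q] is the rank of C_q, and boundaries[q - 1] is the matrix of the
    boundary map from C_q to C_{q-1}, with one column per basis element of C_q.
    """
    ranks = [0] + [rank(b) for b in boundaries] + [0]
    return [dims[q] - ranks[q] - ranks[q + 1] for q in range(len(dims))]


# Hollow triangle: vertices 0, 1, 2 and edges 01, 02, 12 (columns, in that order).
d1 = [
    [-1, -1, 0],
    [1, 0, -1],
    [0, 1, 1],
]
betti = betti_numbers([3, 3], [d1])
```

The result records one connected component and one independent loop for the hollow triangle. Dense elimination of this kind is exactly where fill-in and coefficient growth appear on larger inputs, which is the cost the digraph traversal is meant to avoid.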

References

[1]  M. Jöllenbeck, Algebraic Discrete Morse Theory and Applications to Commutative Algebra, Thesis, (2005).

[2]  D.N. Kozlov, Discrete Morse theory for free chain complexes, C. R. Math. 340 (2005), no. 12, 867–872.

[3]  E. Sköldberg, Morse theory from an algebraic viewpoint, Trans. Amer. Math. Soc. 358 (2006), no. 1, 115–129.

[4]  L. Lampret, Chain complex reduction via fast digraph traversal, arXiv:1903.00783.

Fri, 26 Feb 2021

15:00 - 16:00

A simplicial extension of node2vec

Celia Hacker
(École Polytechnique Fédérale de Lausanne (EPFL))
Abstract

The well-known node2vec algorithm has been used to explore network structures and represent the nodes of a graph in a vector space in a way that reflects the structure of the graph. Random walks in node2vec have been used to study the local structure through pairwise interactions. Our motivation for this project comes from a desire to understand higher-order relationships by a similar approach. To this end, we propose an extension of node2vec to a method for representing the k-simplices of a simplicial complex in Euclidean space.

In this talk I outline a way to do this by performing random walks on simplicial complexes, which have a greater variety of adjacency relations to take into account than in the case of graphs. The walks on simplices are then used to obtain a representation of the simplices. We will show cases in which this method can uncover the roles of higher order simplices in a network and help understand structures in graphs that cannot be seen by using just the random walks on the nodes. 
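
To make the "greater variety of adjacency relations" concrete, the sketch below walks on the edges of a small simplicial complex using two standard notions of adjacency: lower (sharing a face) and upper (sharing a coface). It is a hypothetical illustration with uniform transition probabilities, not the speaker's method, and the names `neighbours` and `random_walk` are illustrative.

```python
import random


def neighbours(simplex, k_simplices, cofaces, mode):
    """k-simplices adjacent to `simplex`.

    mode="lower": the two simplices share a common (k-1)-face.
    mode="upper": the two simplices are faces of a common (k+1)-simplex.
    """
    found = []
    for other in k_simplices:
        if other == simplex:
            continue
        if mode == "lower" and len(simplex & other) == len(simplex) - 1:
            found.append(other)
        elif mode == "upper" and (simplex | other) in cofaces:
            found.append(other)
    return found


def random_walk(start, length, k_simplices, cofaces, mode, rng):
    """Uniform random walk on k-simplices; stops early at a simplex with no neighbours."""
    walk = [start]
    while len(walk) < length:
        options = neighbours(walk[-1], k_simplices, cofaces, mode)
        if not options:
            break
        walk.append(rng.choice(options))
    return walk


# Hypothetical complex: a filled triangle {0, 1, 2} with a dangling edge {2, 3}.
edges = [frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3)]]
triangles = {frozenset((0, 1, 2))}

upper_of_dangling = neighbours(edges[3], edges, triangles, "upper")
lower_of_dangling = neighbours(edges[3], edges, triangles, "lower")
walk = random_walk(edges[3], 5, edges, triangles, "lower", random.Random(0))
```

The dangling edge has two lower neighbours but no upper neighbours, so an upper-adjacency walk started there stops immediately while a lower-adjacency walk can reach the triangle. Sequences of simplices produced this way could then be passed to a word2vec-style embedding step, as node2vec does with walks on nodes.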

Fri, 12 Feb 2021

15:00 - 16:00
Virtual

Applications of Topology and Geometry to Crystal Structure Prediction

Phil Smith
(University of Liverpool)
Abstract

Crystal Structure Prediction aims to reveal the properties that stable crystalline arrangements of a molecule have without setting foot in a laboratory, consequently speeding up the discovery of new functional materials. Since it involves producing large datasets that themselves have little structure, an appropriate classification of crystals could add structure to these datasets and further streamline the process. We focus on geometric invariants, in particular introducing the density fingerprint of a crystal. After exploring its computations via Brillouin zones, we go on to show how it is invariant under isometries, stable under perturbations and complete at least for an open and dense space of crystal structures.

Fri, 29 Jan 2021

15:00 - 16:00
Virtual

Seeing Data through the lens of Geometry (Ollivier Ricci Curvature)

Marzieh Eidi
(Max Planck Institute Leipzig)
Abstract

Ollivier Ricci curvature is a notion that originated in Riemannian geometry and applies in settings ranging from smooth manifolds to discrete structures such as (directed) hypergraphs. In the past few years, alongside Forman Ricci curvature, this edge-based measure has become a popular and powerful tool for network analysis. It is defined via the optimal transport problem (Wasserstein distance) between probability measures supported on data points, and it can detect important structural features such as clustering and sparsity. After introducing this notion for (directed) hypergraphs and mentioning some of its properties, I will present one of its main recent applications: the analysis of chemical reaction networks.
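
The definition via optimal transport can be made concrete on ordinary undirected graphs, a simpler setting than the directed hypergraphs of the talk. The sketch below assumes the common choice of placing a uniform probability measure on the neighbours of each vertex, and it computes the Wasserstein distance by brute force over matchings, which is only valid when both endpoints have the same small degree. The function names and the two test graphs are hypothetical.

```python
from collections import deque
from fractions import Fraction
from itertools import permutations


def distances_from(graph, source):
    """Hop distances from `source` by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ollivier_ricci(graph, x, y):
    """Curvature 1 - W1(mu_x, mu_y) / d(x, y), with mu_v uniform on the neighbours of v.

    Brute force: assumes a connected graph in which x and y have the same
    (small) number of neighbours, so that an optimal transport plan can be
    taken to be a one-to-one matching of neighbours.
    """
    nbrs_x, nbrs_y = sorted(graph[x]), sorted(graph[y])
    if len(nbrs_x) != len(nbrs_y):
        raise ValueError("this sketch needs equal degrees")
    dist = {u: distances_from(graph, u) for u in nbrs_x}
    # Cheapest way to match neighbours of x with neighbours of y.
    best = min(
        sum(dist[u][v] for u, v in zip(nbrs_x, perm))
        for perm in permutations(nbrs_y)
    )
    w1 = Fraction(best, len(nbrs_x))
    return 1 - w1 / distances_from(graph, x)[y]


# Hypothetical test graphs: the complete graph on 4 vertices and a 6-cycle.
complete4 = {v: [u for u in range(4) if u != v] for v in range(4)}
cycle6 = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}

kappa_complete = ollivier_ricci(complete4, 0, 1)
kappa_cycle = ollivier_ricci(cycle6, 0, 1)
```

Edges inside the densely connected complete graph come out positively curved, while edges of the long cycle come out flat, which illustrates how the curvature separates clustered regions from sparse ones. For graphs with unequal degrees the matching shortcut no longer applies and a linear programming solver would be needed.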

Thu, 14 Jan 2021

10:00 - 12:00
Virtual

An invitation to matroid theory - Day 3, Lectures 1 & 2

Greg Henselman-Petrusek
(Mathematical Institute)
Further Information

Zoom passcode: Basis

Abstract

Giancarlo Rota once wrote of matroids that "It is as if one were to condense all trends of present day mathematics onto a single structure, a feat that anyone would a priori deem impossible, were it not for the fact that matroids do exist" (Indiscrete Thoughts, 1997). This makes matroid theory a natural hub through which ideas flow from one field of mathematics to the next. At the end of our three-day workshop, participants will understand the most common objects and constructions in matroid theory to the depth suitable for exploring many of these interesting connections. We will also pick up some highly practical matroid tools for working through problems in persistent homology, (optimal) cycle representatives, and other objects of interest in TDA.

Day 3, Lecture 1

Circuits in persistent homology


Day 3, Lecture 2

Exercise: write your own persistent homology algorithm!