The goal of topological data analysis is to apply tools from algebraic topology to reveal geometric structures hidden within high-dimensional data. Mapper is among its most widely and successfully applied tools, providing a framework for the geometric analysis of point cloud data. Given a number of input parameters, the Mapper algorithm constructs a graph, giving rise to a visual representation of the structure of the data. The Mapper graph is a topological representation: the placement of individual vertices and edges is not important, while geometric features such as loops and flares are revealed.
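To make the construction concrete, here is a minimal sketch of a Mapper-style pipeline, not the implementation used in the work presented. It assumes a one-dimensional filter function, a cover of its range by equal-length overlapping intervals, and single-linkage clustering at a fixed distance threshold; the names `interval_cover`, `eps_clusters`, `mapper_graph` and the parameters `n_intervals`, `overlap`, `eps` are illustrative choices, not taken from any particular library.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; `overlap` is the shared fraction."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def eps_clusters(indices, points, eps):
    """Single-linkage clusters: components of the graph joining points within eps."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            p = stack.pop()
            near = {q for q in unvisited if math.dist(points[p], points[q]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=4, overlap=0.3, eps=0.5):
    """Return (vertices, edges); each vertex is a frozenset of point indices."""
    values = [filter_fn(p) for p in points]
    vertices = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        # Small tolerance so floating-point rounding does not drop endpoint values.
        members = [i for i, v in enumerate(values) if a - 1e-9 <= v <= b + 1e-9]
        vertices.extend(eps_clusters(members, points, eps))
    # Two clusters are joined by an edge when they share at least one data point.
    edges = [(i, j) for i, j in combinations(range(len(vertices)), 2)
             if vertices[i] & vertices[j]]
    return vertices, edges


# Hypothetical example: 40 points on the unit circle, filtered by x-coordinate.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
vertices, edges = mapper_graph(circle, filter_fn=lambda p: p[0],
                               n_intervals=4, overlap=0.3, eps=0.3)
print(len(vertices), "vertices,", len(edges), "edges")
```

For points sampled from a circle, one would expect the resulting graph to be a cycle, reflecting the loop in the data. The filter function, the cover (`n_intervals`, `overlap`) and the clustering threshold (`eps`) are exactly the kind of input parameters whose choice shapes the output.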
However, Mapper's method is rather ad hoc and would therefore benefit from a formal approach governing how to make the necessary choices. In this talk I will present joint work with Francisco Belchì, Jacek Brodzki, and Mahesan Niranjan. We study how sensitive the graph returned by the Mapper algorithm is to perturbations of the data, given a particular tuning of parameters, and how this sensitivity depends on the choice of those parameters. Treating Mapper as a generalisation of clustering, we develop a notion of instability of Mapper and study how it is affected by these choices. In particular, we obtain concrete reasons for high values of Mapper instability and experimentally demonstrate how Mapper instability can be used to determine good Mapper outputs.
Our approach directly tackles the inherent instability of the choice of clustering procedure and requires very few assumptions on the specifics of the data or the chosen Mapper construction, making it applicable to any Mapper-type algorithm.