The amount and complexity of biological data have increased rapidly in recent years with the availability of improved biological tools. When persistent homology is applied to large data sets, however, many of the currently available algorithms fail because of their computational complexity, which rules out many interesting biological applications. De Silva and Carlsson (2004) introduced the so-called witness complex, which reduces computational complexity by building simplicial complexes on a small subset of landmark points selected from the original data set. The landmark points are chosen from the data either at random or using the so-called maxmin algorithm. Neither approach is ideal: random selection tends to favour dense areas of the point cloud, while the maxmin algorithm often selects outliers as landmarks. Both of these problems need to be addressed to make the method more applicable to biological data. We study new ways of selecting landmarks from a large data set that are robust to outliers. We further examine the effects of the different subselection methods on the persistent homology of the data.
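
The two standard landmark selection schemes mentioned above can be illustrated with a minimal sketch. It is not the outlier-robust selection studied in this talk, only the baseline it is compared against. The function names `random_landmarks` and `maxmin_landmarks` are hypothetical, and Euclidean distance, the choice of the first landmark, and the toy point cloud are assumptions made for illustration.

```python
import math
import random


def random_landmarks(points, k, seed=0):
    """Pick k landmark indices uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(len(points)), k)


def maxmin_landmarks(points, k, first=0):
    """Greedy maxmin selection (assumes k <= len(points)).

    Start from the point with index `first`, then repeatedly add the point
    whose distance to its nearest already chosen landmark is largest.
    """
    landmarks = [first]
    # dist_to_set[i] = distance from point i to its nearest landmark so far
    dist_to_set = [math.dist(p, points[first]) for p in points]
    while len(landmarks) < k:
        nxt = max(range(len(points)), key=dist_to_set.__getitem__)
        landmarks.append(nxt)
        # Update nearest-landmark distances with the new landmark.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist_to_set[i]:
                dist_to_set[i] = d
    return landmarks


# Toy point cloud (made up): a dense cluster, one nearby point, one far outlier.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (1.0, 1.0), (10.0, 10.0)]

print(maxmin_landmarks(cloud, 3))  # -> [0, 5, 4]
print(random_landmarks(cloud, 3))
```

On this toy cloud, maxmin picks the outlier at index 5 as its second landmark, which is the behaviour described above. Uniform random sampling, by contrast, draws most of its landmarks from wherever most of the points lie.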
- Topological Data Analysis Seminar