16:00
The "curse of dimensionality" refers to the host of difficulties that arise when we try to extend our intuition about low dimensions (where there are only a few features or variables) to very high dimensions (where there are hundreds or thousands of features, as in genomics or imaging). With very high-dimensional data, the working intuition is often that, although the data is nominally very high-dimensional, it is concentrated around a much lower-dimensional, though non-linear, set. There are many approaches to identifying and representing these sets. We will discuss topological approaches, which represent non-linear sets with graphs and simplicial complexes and permit "measuring the shape of the data" as a tool for identifying useful lower-dimensional representations.
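One well-known symptom of the curse of dimensionality is that distances between random points become nearly indistinguishable as the number of features grows. The sketch below is only an illustration and is not taken from the talk: the function name `relative_contrast`, the uniform unit-cube sampling, and the point counts are all assumptions chosen for the demonstration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max_dist - min_dist) / min_dist for the distances from one
    random query point to n_points random points, all drawn uniformly
    from the unit cube [0, 1]^dim. (Hypothetical helper for illustration.)"""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast between nearest and farthest neighbour shrinks with dim.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

In low dimensions the nearest point is far closer than the farthest one, so the ratio is large; in high dimensions it shrinks toward zero, which is one reason low-dimensional intuition about "nearby" points fails for uniformly spread data.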