Quantifying the Estimation Error of Principal Component

6 February 2020
Raphael Hauser

(Joint work with: Jüri Lember, Heinrich Matzinger, Raul Kangro)

Principal component analysis (PCA) is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance estimate that approximates the population covariance. The eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of this proxy covariance rather than of the ground truth, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The combination of recent results of Koltchinskii & Lounici [8] and Yu, Wang & Samworth [11] yields such bounds. In the presented work we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample whose size is of strictly smaller order.
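To make the dependence on sample size concrete, the following is a minimal numerical sketch and not the method of the talk. It assumes a zero-mean Gaussian population with a diagonal covariance chosen purely for illustration, and it measures the error of the leading eigenvector as the sine of the angle between the estimated and the true eigenvector. The function names, the covariance, the sample sizes and the number of repetitions are all assumptions of this example.

```python
import numpy as np


def top_eigvec(cov):
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]


def sin_angle(u, v):
    # Sine of the angle between two unit vectors; taking |<u, v>| makes
    # the measure invariant to the arbitrary sign of an eigenvector.
    c = min(1.0, abs(float(u @ v)))
    return float(np.sqrt(1.0 - c * c))


def leading_eigvec_error(n, pop_cov, rng):
    # Draw n samples from the (assumed) Gaussian population and compare
    # the leading eigenvector of the estimate with that of the truth.
    d = pop_cov.shape[0]
    x = rng.multivariate_normal(np.zeros(d), pop_cov, size=n)
    # bias=True divides by n: the Gaussian maximum likelihood estimate.
    sample_cov = np.cov(x, rowvar=False, bias=True)
    return sin_angle(top_eigvec(sample_cov), top_eigvec(pop_cov))


rng = np.random.default_rng(0)
# Hypothetical population covariance with well-separated eigenvalues.
pop_cov = np.diag([4.0, 2.0, 1.0, 0.5, 0.25])

# Average the error over 50 independent draws per sample size.
errors = {
    n: float(np.mean([leading_eigvec_error(n, pop_cov, rng) for _ in range(50)]))
    for n in (50, 500, 5000)
}

for n, err in errors.items():
    print(f"n = {n:5d}   mean sin(angle) = {err:.4f}")
```

In this setup the averaged error should shrink as the sample size grows. The experiment only illustrates the quantity being bounded; it says nothing about how tight the bounds discussed in the talk are.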

  • Computational Mathematics and Applications Seminar