We describe how cross-kernel matrices, that is, kernel matrices between the data and a custom-chosen set of `feature spanning points', can be used for learning. The main potential of cross-kernel matrices is twofold: (a) they provide Nyström-type speed-ups for kernel learning without relying on subsampling, thus avoiding potential problems with sampling degeneracy while preserving the usual approximation guarantees and the attractive linear scaling of standard Nyström methods; and (b) the use of non-square matrices for kernel learning yields a non-linear generalization of the singular value decomposition and of singular features. We present a novel algorithm, Ideal PCA (IPCA), a cross-kernel matrix variant of PCA that showcases both advantages: we demonstrate on real and synthetic data that IPCA allows one to (a) obtain kernel-PCA-like features faster and (b) extract novel features of empirical advantage in non-linear manifold learning and classification.
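To make the central object concrete, the following is a minimal sketch of a cross-kernel matrix and of SVD-based features extracted from it. This is not the paper's IPCA algorithm; the RBF kernel, the randomly drawn feature spanning points `Z`, and the number of retained components are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel between the rows of A and the rows of B,
    # computed via pairwise squared Euclidean distances.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # data: 200 samples in 5 dimensions
Z = rng.normal(size=(20, 5))    # 20 hypothetical `feature spanning points'

# Cross-kernel matrix: non-square, 200 x 20, so operations on it
# scale linearly in the number of data points.
C = rbf_kernel(X, Z)

# SVD of the cross-kernel matrix; the left singular vectors, scaled by
# the singular values, serve as PCA-like features of the data.
U, s, Vt = np.linalg.svd(C, full_matrices=False)
features = U[:, :10] * s[:10]   # keep the 10 leading components
```

Because `C` has only as many columns as there are spanning points, the SVD costs O(n) in the sample size n, which is the kind of scaling the abstract attributes to Nyström-type methods.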