Hashing embeddings of optimal dimension, with applications to linear least squares

18 May 2021

We investigate theoretical and numerical properties of sparse sketching for both dense and sparse Linear Least Squares (LLS) problems. We show that sketching with hashing matrices (with one nonzero entry per column and of size proportional to the rank of the data matrix) generates a subspace embedding with high probability, provided the given data matrix has low coherence; thus optimal residual values are approximately preserved when the LLS matrix has similarly important rows. We then show that $s$-hashing matrices, with $s>1$ nonzero entries per column, satisfy similarly good sketching properties for a larger class of low coherence data matrices. Numerically, we introduce our solver Ski-LLS for solving generic dense or sparse LLS problems. Ski-LLS builds upon the successful strategies employed in the Blendenpik and LSRN solvers, which use sketching to calculate a preconditioner before applying the iterative LLS solver LSQR. Ski-LLS significantly improves upon these sketching solvers by judiciously using sparse hashing sketching while also allowing rank-deficient input; furthermore, when the data matrix is sparse, Ski-LLS also applies a sparse factorization to the sketched input. Extensive numerical experiments show that Ski-LLS is also competitive with other state-of-the-art direct and preconditioned iterative solvers for sparse LLS, and outperforms them in the significantly over-determined regime.
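To make the notion of an $s$-hashing sketch concrete, here is a minimal Python sketch, not taken from the talk or from Ski-LLS. It assumes the common convention that each column's $s$ nonzeros are $\pm 1/\sqrt{s}$, placed in rows drawn uniformly without replacement; the abstract fixes only the number of nonzeros per column. The function names `s_hashing_matrix` and `sketch_and_solve` are hypothetical. Note also that the example uses plain sketch-and-solve to illustrate approximate preservation of the residual, whereas Ski-LLS, as described above, uses the sketch to build a preconditioner for LSQR.

```python
import numpy as np
from scipy.sparse import csc_matrix


def s_hashing_matrix(m, n, s=1, rng=None):
    """Return an m x n sparse matrix with exactly s nonzeros per column.

    Assumed convention (not stated in the abstract): each nonzero is
    +1/sqrt(s) or -1/sqrt(s) with equal probability, and the s row
    positions in a column are drawn uniformly without replacement.
    """
    rng = np.random.default_rng(rng)
    # Row indices: s distinct rows for each of the n columns.
    rows = np.concatenate(
        [rng.choice(m, size=s, replace=False) for _ in range(n)]
    )
    cols = np.repeat(np.arange(n), s)
    vals = rng.choice([-1.0, 1.0], size=n * s) / np.sqrt(s)
    return csc_matrix((vals, (rows, cols)), shape=(m, n))


def sketch_and_solve(A, b, m, s=1, rng=None):
    """Solve min_x ||S A x - S b||_2 for an s-hashing sketch S with m rows."""
    S = s_hashing_matrix(m, A.shape[0], s=s, rng=rng)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, m = 2000, 10, 200            # tall problem: n >> d, sketch size m
    A = rng.standard_normal((n, d))    # Gaussian rows: low coherence
    b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

    x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)
    x_sk = sketch_and_solve(A, b, m, s=3, rng=1)

    r_opt = np.linalg.norm(A @ x_opt - b)
    r_sk = np.linalg.norm(A @ x_sk - b)
    print(f"residual ratio (sketched / optimal): {r_sk / r_opt:.4f}")
```

The sketched problem has only `m` rows instead of `n`, so the dense solve is much cheaper. The printed ratio is always at least 1, and for a low coherence matrix such as the Gaussian example it should stay close to 1.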

A link for this talk will be sent to our mailing list a day or two in advance. If you are not on the list and wish to be sent a link, please contact trefethen@maths.ox.ac.uk.

  • Numerical Analysis Group Internal Seminar