Matrix Factorization with Expander Graphs

4 February 2020
Michael Murray

Many computational techniques in data science involve the factorization of a data matrix into the product of two or more structured matrices. Examples include PCA, which relies on computing an SVD, recommendation systems, which leverage non-negative matrix factorization, infilling missing entries with low rank matrix completion, and finding sparse representations via dictionary learning. In our work we study a new matrix factorization problem, involving the recovery of $\textbf{A}$ and $\textbf{X}$ from $\textbf{Y} := \textbf{A}\textbf{X}$ under the following assumptions; $\textbf{A}$ is an $m \times n$ sparse binary matrix with a fixed number $d$ of nonzeros per column and $\textbf{X}$ is an $n \times N$ sparse real matrix whose columns have $k$ nonzeros and are dissociated. This setup is inspired and motivated by similar models studied in the dictionary learning literature as well as potential connections both with stochastic block models and combinatorial compressed sensing. In this talk we present a new algorithm, EBR, for solving this problem, as well as recovery guarantees in the context of a particular probabilistic data model. Using the properties of expander graphs we are able to show, under certain assumptions, that with just $N = \textit{O}( \log^2(n))$ samples then EBR recovers the factorization up to permutation with high probability. 

  • Numerical Analysis Group Internal Seminar