Date
Tue, 04 Feb 2020
14:00
Location
L5
Speaker
Michael Murray
Organisation
Oxford

Many computational techniques in data science involve the factorization of a data matrix into the product of two or more structured matrices. Examples include PCA, which relies on computing an SVD, recommendation systems, which leverage non-negative matrix factorization, infilling missing entries with low rank matrix completion, and finding sparse representations via dictionary learning. In our work we study a new matrix factorization problem, involving the recovery of $\textbf{A}$ and $\textbf{X}$ from $\textbf{Y} := \textbf{A}\textbf{X}$ under the following assumptions; $\textbf{A}$ is an $m \times n$ sparse binary matrix with a fixed number $d$ of nonzeros per column and $\textbf{X}$ is an $n \times N$ sparse real matrix whose columns have $k$ nonzeros and are dissociated. This setup is inspired and motivated by similar models studied in the dictionary learning literature as well as potential connections both with stochastic block models and combinatorial compressed sensing. In this talk we present a new algorithm, EBR, for solving this problem, as well as recovery guarantees in the context of a particular probabilistic data model. Using the properties of expander graphs we are able to show, under certain assumptions, that with just $N = \textit{O}( \log^2(n))$ samples then EBR recovers the factorization up to permutation with high probability. 

Please contact us with feedback and comments about this page. Last updated on 03 Apr 2022 01:32.