Date: Tue, 24 Jan 2023, 14:30
Location: L3
Speaker: Zhen Shao

We investigate the theoretical properties of subsampling and hashing as tools for approximate, Euclidean-norm-preserving embeddings of vectors perturbed by (unknown) additive Gaussian noise. Such embeddings are called Johnson-Lindenstrauss embeddings after the celebrated lemma. Previous work shows that when these sparse embeddings are required to have an embedding dimension comparable to that of Gaussian matrices, their success depends closely on the $l_\infty$-to-$l_2$ ratios of the vectors to be mapped. This paper shows that the presence of noise removes this constraint in high dimensions; in other words, sparse embeddings such as subsampling and hashing, with embedding dimensions comparable to those of dense embeddings, have similar norm-preserving dimensionality-reduction properties, regardless of the $l_\infty$-to-$l_2$ ratios of the vectors to be mapped. The key idea in our result is that the noise should be treated as information to be exploited, not simply a nuisance to be removed. Numerical illustrations show better performance of sparse embeddings in the presence of noise.
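To make the setting concrete, here is a minimal sketch (not the speaker's code) comparing coordinate subsampling and a count-sketch-style hashing embedding on a "spiky" vector, with and without additive Gaussian noise. The dimensions `n`, `m`, the noise level `sigma`, and the test vector are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, sigma = 10_000, 400, 0.1  # ambient dim, embedding dim, noise std (hypothetical)

# A "spiky" vector with maximal l_inf-to-l_2 ratio: the hard case for
# sparse embeddings in the noiseless setting.
x = np.zeros(n)
x[0] = 1.0
x_noisy = x + sigma * rng.standard_normal(n)  # additive Gaussian noise

def subsample(v, m, rng):
    """Scaled coordinate subsampling: keep m random coordinates of v,
    scaled by sqrt(n/m) so the squared norm is preserved in expectation."""
    idx = rng.choice(v.size, size=m, replace=False)
    return np.sqrt(v.size / m) * v[idx]

def hashing(v, m, rng):
    """Count-sketch-style hashing: each coordinate gets a random bucket
    and a random sign; coordinates landing in the same bucket are summed."""
    buckets = rng.integers(0, m, size=v.size)
    signs = rng.choice([-1.0, 1.0], size=v.size)
    out = np.zeros(m)
    np.add.at(out, buckets, signs * v)
    return out

for label, v in [("noiseless spike", x), ("noisy spike", x_noisy)]:
    for name, embed in [("subsampling", subsample), ("hashing", hashing)]:
        y = embed(v, m, rng)
        ratio = np.linalg.norm(y) / np.linalg.norm(v)
        # A ratio near 1 means the Euclidean norm is preserved.
        print(f"{label:16s} {name:12s} norm ratio: {ratio:.3f}")
```

Under these hypothetical parameters, subsampling typically misses the noiseless spike entirely and returns a norm ratio far from 1 (hashing preserves a 1-sparse vector exactly; its failures on spiky inputs show up in the tail over random bucket and sign draws). On the noisy vector, whose energy the Gaussian noise spreads across all coordinates, both sparse embeddings return ratios close to 1, in line with the abstract's claim.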
