Date
Tue, 24 Jan 2023
14:30
Location
L3
Speaker
Zhen Shao

We investigate the theoretical properties of subsampling and hashing as tools for approximate, Euclidean-norm-preserving embeddings of vectors subject to (unknown) additive Gaussian noise. Such embeddings are known as Johnson-Lindenstrauss embeddings, after the celebrated lemma. Previous work shows that when these sparse embeddings are required to have an embedding dimension comparable to that of dense Gaussian matrices, their success depends closely on the $l_\infty$-to-$l_2$ ratio of the vectors being mapped. We show that the presence of noise removes this constraint in high dimensions; in other words, sparse embeddings such as subsampling and hashing, with embedding dimensions comparable to those of dense embeddings, have similar norm-preserving dimensionality-reduction properties, regardless of the $l_\infty$-to-$l_2$ ratio of the vectors being mapped. The key idea in our result is that the noise should be treated as information to be exploited, not merely a nuisance to be removed. Numerical illustrations show improved performance of sparse embeddings in the presence of noise.
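A minimal numerical sketch may help make the setting concrete. The script below is not from the talk; the dimensions, noise level, and trial counts are illustrative assumptions. It compares how subsampling, hashing, and a dense Gaussian map preserve the norm of a worst-case "spiky" vector (maximal $l_\infty$-to-$l_2$ ratio), with and without additive Gaussian noise.

```python
# Illustrative sketch (not the speaker's code): compare norm preservation of
# subsampling, hashing, and a dense Gaussian embedding on a "spiky" vector,
# with and without additive Gaussian noise. All parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, m = 10_000, 500                       # ambient / embedding dimension (assumed)

x = np.zeros(n)
x[0] = 1.0                               # spike: l_inf / l_2 ratio equals 1
noise = rng.normal(scale=0.05, size=n)   # unknown additive Gaussian noise

def subsample(v):
    """Sample m coordinates uniformly at random, rescaled so that
    E[||Sv||^2] = ||v||^2."""
    idx = rng.integers(0, n, size=m)
    return v[idx] * np.sqrt(n / m)

def hashing(v):
    """Sparse sign embedding: each coordinate is hashed to one random row
    with a random sign; E[||Sv||^2] = ||v||^2 without rescaling."""
    rows = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    out = np.zeros(m)
    np.add.at(out, rows, signs * v)      # unbuffered accumulation per row
    return out

def gaussian(v):
    """Dense Gaussian JL embedding, for reference."""
    return rng.normal(size=(m, n)) @ v / np.sqrt(m)

for label, v in [("noiseless spike", x), ("spike + noise", x + noise)]:
    for name, embed in [("subsampling", subsample),
                        ("hashing", hashing),
                        ("gaussian", gaussian)]:
        r = [np.linalg.norm(embed(v)) / np.linalg.norm(v) for _ in range(20)]
        print(f"{label:16s} {name:12s} ||Sv||/||v|| = "
              f"{np.mean(r):.3f} +/- {np.std(r):.3f}")
```

On a typical run, subsampling badly distorts the norm of the noiseless spike (it usually misses the single large coordinate) but tracks the norm closely once noise is added, consistent with the abstract's claim that noise removes the dependence on the $l_\infty$-to-$l_2$ ratio.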
