Permutation compressors for provably faster distributed nonconvex optimization
Abstract
iii) identify a special class of correlated compressors based on the idea of random permutations, for which we coin the term PermK. The use of this technique results in a strict improvement over the previous MARINA rate. In the low Hessian variance regime, the improvement can be as large as √n when d > n, and 1 + d/√n when d ≤ n, where n is the number of workers and d is the number of parameters describing the model we are learning.