11 April 2019
Proceedings of Machine Learning Research
We propose a new method to estimate Wasserstein distances and optimal transport plans between two probability distributions from samples in high dimension. Unlike plug-in rules that simply replace the true distributions by their empirical counterparts, our method promotes couplings with low transport rank, a new structural assumption that is similar to the nonnegative rank of a matrix. Regularizing based on this assumption leads to drastic improvements on high-dimensional data for various tasks, including domain adaptation in single-cell RNA sequencing data. These findings are supported by a theoretical analysis that indicates that the transport rank is key in overcoming the curse of dimensionality inherent to data-driven optimal transport.
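For context, the plug-in rule that the abstract contrasts against can be sketched in a few lines. This is only an illustration of the baseline, not the proposed low-transport-rank method. The function name `plugin_wasserstein2` is hypothetical, and the sketch assumes two samples of equal size with uniform weights and a squared Euclidean cost, in which case an optimal coupling between the empirical distributions can be taken to be a one-to-one matching.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def plugin_wasserstein2(x, y):
    """Plug-in estimate of the 2-Wasserstein distance (illustrative baseline).

    x, y: arrays of shape (n, d) holding n samples from each distribution.
    Assumes equal sample sizes and uniform weights (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("x and y must both have shape (n, d)")

    # Pairwise squared Euclidean costs between the two empirical samples.
    cost = cdist(x, y, metric="sqeuclidean")

    # With equal sizes and uniform weights, solving the assignment problem
    # yields an optimal coupling between the two empirical distributions.
    rows, cols = linear_sum_assignment(cost)

    # Average matched cost, then square root to obtain W_2.
    return float(np.sqrt(cost[rows, cols].mean()))
```

Calling `plugin_wasserstein2(x, y)` on two `(n, d)` arrays returns a single nonnegative number. The estimate uses no structural assumption on the coupling, which is the limitation in high dimension that the abstract refers to.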