Title: Linear Time Sinkhorn Divergences using Positive Features

Abstract: Although Sinkhorn divergences are now routinely used in data science to compare probability distributions, computing them remains expensive, with a cost that in general grows quadratically in the size $n$ of the support of these distributions. Indeed, solving optimal transport (OT) with an entropic regularization requires computing an $n\times n$ kernel matrix (the neg-exponential of an $n\times n$ pairwise ground cost matrix) that is repeatedly applied to a vector. We propose to use instead ground costs of the form $c(x,y)=-\log\langle\varphi(x),\varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r\ll n$. This choice yields, equivalently, a kernel $k(x,y)=\langle\varphi(x),\varphi(y)\rangle$, and ensures that the cost of Sinkhorn iterations scales as $O(nr)$. We show that usual cost functions can be approximated using this form. Additionally, we take advantage of the fact that our approach yields approximations that remain fully differentiable with respect to the input distributions, as opposed to previously proposed adaptive low-rank approximations of the kernel matrix, to train a faster variant of OT-GAN \cite{salimans2018improving}.
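
To illustrate why a kernel of the form $k(x,y)=\langle\varphi(x),\varphi(y)\rangle$ gives $O(nr)$ Sinkhorn iterations, here is a minimal NumPy sketch. It is not the authors' implementation. The function names, the random positive features standing in for $\varphi$, the uniform weights, and the iteration count are all assumptions chosen for illustration. The point is only that the product of the kernel matrix with a vector can be evaluated as two thin matrix-vector products, so the $n\times n$ matrix is never formed.

```python
import numpy as np

def sinkhorn_positive_features(phi_x, phi_y, a, b, n_iters=200):
    """Sinkhorn scaling with a factorized kernel K = phi_x @ phi_y.T.

    phi_x: (n, r) strictly positive features of the first point cloud
    phi_y: (m, r) strictly positive features of the second point cloud
    a, b:  weights of shape (n,) and (m,), each summing to 1
    K is never materialized; one iteration costs O((n + m) * r).
    """
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        # K @ v, evaluated as phi_x @ (phi_y.T @ v)
        u = a / (phi_x @ (phi_y.T @ v))
        # K.T @ u, evaluated as phi_y @ (phi_x.T @ u)
        v = b / (phi_y @ (phi_x.T @ u))
    return u, v

def plan_marginals(phi_x, phi_y, u, v):
    """Row and column sums of the plan diag(u) K diag(v), without forming K."""
    row = u * (phi_x @ (phi_y.T @ v))
    col = v * (phi_y @ (phi_x.T @ u))
    return row, col

# Hypothetical data: random positive features stand in for phi(x) and phi(y).
rng = np.random.default_rng(0)
n, m, r = 500, 400, 16
phi_x = rng.uniform(0.1, 1.0, size=(n, r))
phi_y = rng.uniform(0.1, 1.0, size=(m, r))
a = np.full(n, 1.0 / n)
b = np.full(m, 1.0 / m)

u, v = sinkhorn_positive_features(phi_x, phi_y, a, b)
row, col = plan_marginals(phi_x, phi_y, u, v)
print(np.abs(row - a).max(), np.abs(col - b).max())
```

The printed values are the largest deviations of the plan's marginals from the prescribed weights `a` and `b`; they should be close to zero once the iterations have converged. Because every step is a composition of matrix products and elementwise operations, the same computation written in an automatic differentiation framework would remain differentiable with respect to the features.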
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2006.07057 [stat.ML]
  (or arXiv:2006.07057v3 [stat.ML] for this version)

Submission history

From: Meyer Scetbon
[v1] Fri, 12 Jun 2020 10:21:40 GMT (2859kb,D)
[v2] Thu, 25 Jun 2020 09:52:20 GMT (2859kb,D)
[v3] Mon, 26 Oct 2020 07:55:17 GMT (3280kb,D)
