Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance
Authors: Giulia Luise, Alessandro Rudi, Massimiliano Pontil, Carlo Ciliberto
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Promising preliminary experiments complement our analysis. We provide preliminary empirical evidence of the effectiveness of the proposed approach. We present here experiments comparing the two Sinkhorn approximations empirically. |
| Researcher Affiliation | Academia | 1 Department of Computer Science, University College London, London, UK. 2 INRIA, Département d'informatique, École Normale Supérieure, PSL Research University, Paris, France. 3 Istituto Italiano di Tecnologia, Genova, Italy. 4 Department of Electrical and Electronic Engineering, Imperial College London, UK. |
| Pseudocode | Yes | Algorithm 1: Computation of ∇aSλ(a, b) (a minimal Sinkhorn sketch appears below the table). |
| Open Source Code | Yes | The implementation of this comparison is available online: https://github.com/GiulsLu/OT-gradients |
| Open Datasets | Yes | Google Quick Draw. We compared the performance of the two estimators on a challenging dataset. We selected c = 2, 4, 10 classes from the Google Quick Draw dataset [38], which consists of images of size 28×28 pixels. [38] Google Inc. Quick Draw Dataset. https://github.com/googlecreativelab/quickdraw-dataset. |
| Dataset Splits | Yes | We trained the structured prediction estimators on 1000 images per class and tested on another 1000 images. We used a Gaussian kernel with bandwidth σ and regularization parameter γ selected by cross-validation. |
| Hardware Specification | Yes | Experiments were run on an Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz with 16GB RAM. |
| Software Dependencies | No | The paper mentions using 'efficient off-the-shelf implementations (BLAS, LAPACK)' and implies that the linked code is in Python, but it does not specify version numbers for any software dependency. |
| Experiment Setup | Yes | We empirically chose the Sinkhorn regularization parameter λ to be the smallest value such that the output Tλ of the Sinkhorn algorithm would be within 10⁻⁶ of the transport polytope in 1000 iterations. We compared the gradient obtained with Alg. 1 and Automatic Differentiation (AD) on random histograms with different n (y-axis), m (x-axis), and regularization λ = 0.02 (see the gradient-comparison sketch below the table). |
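
The stopping rule quoted in the Experiment Setup row (marginal violation below 10⁻⁶ within 1000 iterations) pins down the Sinkhorn loop precisely enough to sketch. Below is a minimal NumPy sketch of the plain Sinkhorn-Knopp iteration with that criterion; it is not a reproduction of the paper's Algorithm 1, which additionally computes the gradient ∇aSλ(a, b), and the function name and interface are our own.

```python
import numpy as np

def sinkhorn(a, b, M, lam, max_iter=1000, tol=1e-6):
    """Sinkhorn-Knopp iterations for entropy-regularized OT.

    a, b : histograms (nonnegative, summing to 1) of sizes n and m
    M    : (n, m) ground cost matrix
    lam  : entropic regularization parameter (the paper's lambda)
    Stops once the transport plan is within `tol` of the transport
    polytope U(a, b), mirroring the criterion quoted above.
    """
    K = np.exp(-M / lam)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(max_iter):
        v = b / (K.T @ u)                # fit the column marginal
        u = a / (K @ v)                  # fit the row marginal (now exact)
        T = u[:, None] * K * v[None, :]  # current transport plan
        # After the u-update the row marginals equal a exactly, so the
        # distance to U(a, b) is measured on the column marginals.
        if np.abs(T.sum(axis=0) - b).sum() <= tol:
            break
    return T, u, v
```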
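The same row compares the paper's Algorithm 1 against automatic differentiation (AD) of the unrolled Sinkhorn loop. Algorithm 1 targets the gradient of the sharp Sinkhorn approximation via a linear system, which we do not attempt to reproduce here; as an illustration of the AD baseline only, the PyTorch sketch below differentiates the regularized Sinkhorn cost through its unrolled iterations and checks the result against the closed-form dual potential λ log u. The function names, problem sizes, and random costs are ours; simplex gradients are defined only up to an additive constant, so the comparison is made after centering.

```python
import torch

def reg_sinkhorn_cost(a, b, M, lam, n_iter=1000):
    """Regularized Sinkhorn cost <T, M> + lam * sum T (log T - 1),
    built from unrolled Sinkhorn iterations so autograd can trace it."""
    K = torch.exp(-M / lam)
    u = torch.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    T = u[:, None] * K * v[None, :]
    return (T * M).sum() + lam * (T * (T.log() - 1)).sum(), u

torch.manual_seed(0)
n, m, lam = 50, 60, 0.02                          # lam as in the table
a = torch.rand(n, dtype=torch.float64); a = (a / a.sum()).requires_grad_()
b = torch.rand(m, dtype=torch.float64); b = b / b.sum()
M = torch.rand(n, m, dtype=torch.float64)         # random ground cost

cost, u = reg_sinkhorn_cost(a, b, M, lam)
grad_ad, = torch.autograd.grad(cost, a)           # the AD baseline
grad_cf = lam * u.detach().log()                  # closed-form dual potential
# Center both gradients before comparing (constant ambiguity on the simplex).
err = ((grad_ad - grad_ad.mean()) - (grad_cf - grad_cf.mean())).abs().max()
print(f"max deviation after centering: {err:.2e}")
```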