Linear Time Sinkhorn Divergences using Positive Features
Authors: Meyer Scetbon, Marco Cuturi
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figures 1 and 3 we plot the deviation from ground truth, defined as $D := 100 \cdot \lvert \mathrm{ROT} - \widehat{\mathrm{ROT}} \rvert / \lvert \mathrm{ROT} \rvert$, and show the time-accuracy tradeoff for our proposed method (RF), Nyström (Nys) [3], and Sinkhorn (Sin) [16], for a range of regularization parameters |
| Researcher Affiliation | Collaboration | Meyer Scetbon, CREST, ENSAE, Institut Polytechnique de Paris, meyer.scetbon@ensae.fr; Marco Cuturi, Google Brain, CREST, ENSAE, cuturi@google.com |
| Pseudocode | Yes | Algorithm 1 Sinkhorn. Inputs: K, a, b, δ, u. repeat v ← b / (Kᵀu), u ← a / (Kv) until ‖v ⊙ Kᵀu − b‖₁ < δ; Result: u, v (a NumPy transcription of these updates is sketched after the table) |
| Open Source Code | Yes | The code is available at github.com/meyerscetbon/LinearSinkhorn. |
| Open Datasets | Yes | We train our GAN models on a Tesla K80 GPU for 84 hours on two different datasets, namely CIFAR-10 dataset [35] and CelebA dataset [38] |
| Dataset Splits | No | The paper uses datasets like CIFAR-10 and CelebA but does not specify the exact training, validation, and test splits (e.g., percentages or sample counts) used for reproduction. |
| Hardware Specification | Yes | We train our GAN models on a Tesla K80 GPU for 84 hours on two different datasets, namely CIFAR-10 dataset [35] and CelebA dataset [38] |
| Software Dependencies | No | The paper mentions the use of neural networks and GANs, implying common deep learning frameworks, but does not provide specific version numbers for software dependencies like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | More precisely we take the exact same functions used in [46, 36] to define g and f_γ. Moreover, φ is the feature map associated to the Gaussian kernel defined in Lemma 1, whose random parameters are initialised with a normal distribution. The number of random features considered has been fixed to be r = 600 in the following. The training procedure is the same as [27, 36] and consists in alternating n_c optimisation steps to train the cost function c_{h_γ} and an optimisation step to train the generator g_θ. (...) where we set the number of batches s = 7000, the regularization ε = 1, and the number of features r = 600. (A sketch of a random feature map plugged into Sinkhorn follows the table.) |
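
For reference, the Sinkhorn updates quoted in the Pseudocode row translate line-for-line into NumPy. The sketch below is a minimal transcription of Algorithm 1 as quoted above, not the authors' released implementation; the function name `sinkhorn` and the defaults (`delta`, `max_iter`) are illustrative choices.

```python
import numpy as np

def sinkhorn(K, a, b, delta=1e-6, u=None, max_iter=10_000):
    """Sinkhorn scaling iterations (Algorithm 1 as quoted above).

    K: (n, m) positive kernel matrix, e.g. K = exp(-C / eps) for a cost matrix C.
    a: (n,) source marginal; b: (m,) target marginal; delta: stopping tolerance.
    """
    u = np.ones_like(a) if u is None else u
    for _ in range(max_iter):
        v = b / (K.T @ u)                      # v <- b / (K^T u)
        u = a / (K @ v)                        # u <- a / (K v)
        if np.linalg.norm(v * (K.T @ u) - b, 1) < delta:
            break
    return u, v

# Tiny usage example on random point clouds (eps = 1 matches the regularization
# quoted in the Experiment Setup row; the data here is purely illustrative).
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)    # squared Euclidean cost
K = np.exp(-C / 1.0)
a, b = np.full(5, 1 / 5), np.full(6, 1 / 6)
u, v = sinkhorn(K, a, b)
P = u[:, None] * K * v[None, :]                        # transport plan with marginals ~ (a, b)
```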
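The Experiment Setup row refers to the feature map φ of the Gaussian kernel from the paper's Lemma 1 with r = 600 random features; the point of the method is that the kernel matrix is never materialised, only its nonnegative low-rank factors, so every Sinkhorn update costs O((n + m)r) instead of O(nm). The sketch below illustrates that structure. The concrete feature construction here relies on a standard Gaussian integral identity and is an assumption on my part: it is in the spirit of Lemma 1 but is not guaranteed to match the paper's exact parameterisation; `positive_features`, `rho_var`, and the unit kernel bandwidth are illustrative choices.

```python
import numpy as np

def positive_features(X, U, rho_var=1.0):
    """Nonnegative random features for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2).

    Based on the identity exp(-||x - y||^2 / 2) = (2/pi)^(d/2) * integral of
    exp(-||x - u||^2 - ||y - u||^2) du, estimated by Monte Carlo with anchors
    u_j ~ N(0, rho_var * I) and importance weights 1 / rho(u_j).
    Sketch only; the paper's Lemma 1 may parameterise this differently.
    """
    n, d = X.shape
    r = U.shape[0]
    sq = ((X[:, None, :] - U[None, :, :]) ** 2).sum(-1)              # (n, r): ||x_i - u_j||^2
    log_rho = -0.5 * (U ** 2).sum(-1) / rho_var - 0.5 * d * np.log(2 * np.pi * rho_var)  # (r,)
    log_phi = 0.25 * d * np.log(2.0 / np.pi) - sq - 0.5 * log_rho    # importance-weighted log-features
    return np.exp(log_phi) / np.sqrt(r)                              # (n, r), all entries >= 0

# Linear-time Sinkhorn: with K ~= Phi @ Psi.T (all entries nonnegative), the
# products K @ v and K.T @ u become Phi @ (Psi.T @ v) and Psi @ (Phi.T @ u),
# i.e. O((n + m) r) work per iteration instead of O(n m).
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(500, 2)), rng.normal(size=(400, 2))
U = rng.normal(size=(600, 2))                    # r = 600 anchors, as in the paper's setup
Phi, Psi = positive_features(X, U), positive_features(Y, U)
a, b = np.full(500, 1 / 500), np.full(400, 1 / 400)
u = np.ones(500)
for _ in range(200):
    v = b / (Psi @ (Phi.T @ u))
    u = a / (Phi @ (Psi.T @ v))
```

For the actual Lemma 1 features and the full Sinkhorn divergence pipeline, see the repository linked in the Open Source Code row.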