Learning Latent Permutations with Gumbel-Sinkhorn Networks
Authors: Gonzalo Mena, David Belanger, Scott Linderman, Jasper Snoek
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method by outperforming competitive baselines on a range of qualitatively different tasks: sorting numbers, solving jigsaw puzzles, and identifying neural signals in worms. |
| Researcher Affiliation | Collaboration | Gonzalo E. Mena, Department of Statistics, Columbia University (gem2131@columbia.edu); David Belanger, Google Brain; Scott Linderman, Department of Statistics, Columbia University; Jasper Snoek, Google Brain |
| Pseudocode | No | The paper describes methods and processes but does not include a section or figure explicitly labeled “Pseudocode” or “Algorithm”. |
| Open Source Code | Yes | We have made available Tensorflow code for Gumbel-Sinkhorn networks featuring an implementation of the number sorting experiment at http://github.com/google/gumbel_sinkhorn . |
| Open Datasets | Yes | In Table 2, we benchmark results for the MNIST, Celeba and Imagenet datasets, with puzzles between 2x2 and 6x6 pieces. |
| Dataset Splits | No | The paper uses phrases like ‘test data’ and ‘test examples’ but does not provide specific details about the training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | All experiments were run on a cluster using Tensorflow Abadi et al. (2016), using several GPU (Tesla K20, K40, K80 and P100) in parallel to enable an efficient exploration of the hyperparameter space: temperature, learning rate, and neural network parameters (dimensions). |
| Software Dependencies | No | The paper mentions using ‘Tensorflow Abadi et al. (2016)’ but does not specify a version number for Tensorflow or other software libraries used. |
| Experiment Setup | Yes | In all cases, we used L = 20 Sinkhorn Operator Iterations, and a 10x10 batch size: for each sample in the batch we used Gumbel perturbations to generate 10 different reconstructions. |
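The setup row above refers to the paper's core primitive: running L = 20 Sinkhorn operator iterations on a Gumbel-perturbed, temperature-scaled matrix to approximate a latent permutation. As a point of reference only, here is a minimal numpy sketch of that operator (this is not the authors' released Tensorflow code; function names, the log-space normalization style, and the default temperature are illustrative choices):

```python
import numpy as np

def sinkhorn(log_x, n_iters=20):
    """Sinkhorn operator: alternately normalize rows and columns
    (in log space, for numerical stability) so the matrix
    approaches a doubly-stochastic matrix."""
    for _ in range(n_iters):
        # subtract row log-sums -> rows sum to 1
        log_x = log_x - np.logaddexp.reduce(log_x, axis=1, keepdims=True)
        # subtract column log-sums -> columns sum to 1
        log_x = log_x - np.logaddexp.reduce(log_x, axis=0, keepdims=True)
    return np.exp(log_x)

def gumbel_sinkhorn(log_x, tau=1.0, n_iters=20, rng=None):
    """Perturb the score matrix with i.i.d. Gumbel noise, divide by the
    temperature tau, then apply the Sinkhorn operator. Lower tau pushes
    the output closer to a hard permutation matrix."""
    rng = np.random.default_rng() if rng is None else rng
    gumbel = -np.log(-np.log(rng.uniform(size=log_x.shape)))
    return sinkhorn((log_x + gumbel) / tau, n_iters)
```

Under this reading of the setup row, each of the 10 samples in a batch would receive 10 independent Gumbel perturbations, yielding 10 soft-permutation reconstructions per sample.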