Learning Latent Permutations with Gumbel-Sinkhorn Networks

Authors: Gonzalo Mena, David Belanger, Scott Linderman, Jasper Snoek

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of our method by outperforming competitive baselines on a range of qualitatively different tasks: sorting numbers, solving jigsaw puzzles, and identifying neural signals in worms."
Researcher Affiliation | Collaboration | Gonzalo E. Mena (Department of Statistics, Columbia University, gem2131@columbia.edu); David Belanger (Google Brain); Scott Linderman (Department of Statistics, Columbia University); Jasper Snoek (Google Brain)
Pseudocode | No | The paper describes methods and processes but does not include a section or figure explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | Yes | "We have made available Tensorflow code for Gumbel-Sinkhorn networks featuring an implementation of the number sorting experiment at http://github.com/google/gumbel_sinkhorn."
Open Datasets | Yes | "In Table 2, we benchmark results for the MNIST, Celeba and Imagenet datasets, with puzzles between 2x2 and 6x6 pieces."
Dataset Splits | No | The paper uses phrases like "test data" and "test examples" but does not provide specific details about the training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology).
Hardware Specification | Yes | "All experiments were run on a cluster using Tensorflow (Abadi et al., 2016), using several GPUs (Tesla K20, K40, K80 and P100) in parallel to enable an efficient exploration of the hyperparameter space: temperature, learning rate, and neural network parameters (dimensions)."
Software Dependencies | No | The paper mentions using "Tensorflow (Abadi et al., 2016)" but does not specify a version number for Tensorflow or other software libraries used.
Experiment Setup | Yes | "In all cases, we used L = 20 Sinkhorn operator iterations, and a 10x10 batch size: for each sample in the batch we used Gumbel perturbations to generate 10 different reconstructions."
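For context on the "L = 20 Sinkhorn operator iterations" and "Gumbel perturbations" mentioned in the Experiment Setup row, a minimal NumPy sketch of the Gumbel-Sinkhorn operator follows. This is an illustrative re-implementation, not the authors' released TensorFlow code; the function names and the log-space normalization are our own choices. The Sinkhorn operator alternately normalizes rows and columns of exp(X), producing an approximately doubly-stochastic matrix; adding Gumbel noise and dividing by a temperature tau before normalizing gives a differentiable sample that approaches a hard permutation as tau → 0.

```python
import numpy as np

def sinkhorn(log_alpha, n_iters=20):
    """Sinkhorn operator: alternate row/column normalization of
    exp(log_alpha), done in log space for numerical stability."""
    for _ in range(n_iters):
        # Row normalization: subtract log-sum-exp of each row.
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=1, keepdims=True)
        # Column normalization: subtract log-sum-exp of each column.
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=0, keepdims=True)
    return np.exp(log_alpha)

def gumbel_sinkhorn(log_alpha, tau=1.0, n_iters=20, seed=None):
    """Perturb scores with Gumbel noise, temper by tau, then apply
    the Sinkhorn operator to get a soft (relaxed) permutation."""
    rng = np.random.default_rng(seed)
    gumbel = -np.log(-np.log(rng.uniform(size=log_alpha.shape)))
    return sinkhorn((log_alpha + gumbel) / tau, n_iters=n_iters)
```

With enough iterations the output's rows and columns each sum to roughly 1, and lowering tau concentrates mass toward a single permutation matrix. The paper's reported setting corresponds to `n_iters=20`, with temperature tuned as a hyperparameter.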