Invertible Gaussian Reparameterization: Revisiting the Gumbel-Softmax

Authors: Andres Potapczynski, Gabriel Loaiza-Ganem, John P. Cunningham

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our construction enjoys theoretical advantages over the Gumbel-Softmax, such as closed form KL, and significantly outperforms it in a variety of experiments.
Researcher Affiliation | Collaboration | Andres Potapczynski, Zuckerman Institute, Columbia University (ap3635@columbia.edu); Gabriel Loaiza-Ganem, Layer 6 AI (gabriel@layer6.ai); John P. Cunningham, Department of Statistics, Columbia University (jpc2181@columbia.edu)
Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Our code is available at https://github.com/cunningham-lab/igr.
Open Datasets | Yes | The datasets we use are handwritten digits from MNIST, fashion items from FMNIST and alphabet symbols from Omniglot.
Dataset Splits | Yes | We thus choose the temperature hyperparameter through cross validation, considering the range of possible temperatures {0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0} and compare best-performing models.
Hardware Specification | No | No specific hardware details (such as CPU/GPU models or memory) used for running the experiments are provided in the paper.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | For the experiments involving a KL term, we use variational autoencoders (VAEs) [14]. We trained VAEs composed of 20 discrete variables with 10 categories each. For MNIST and Omniglot we used a fixed binarization and a Bernoulli decoder, whereas for FMNIST we use a Gaussian decoder. We ran each experiment 5 times and report averages plus/minus one standard deviation. We thus choose the temperature hyperparameter through cross validation, considering the range of possible temperatures {0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0} and compare best-performing models.
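The "Research Type" row above quotes the paper's central claim: a Gaussian-based relaxation of categorical variables that keeps a closed-form KL and outperforms the Gumbel-Softmax. The sketch below is a minimal NumPy illustration, not the authors' implementation (that lives at https://github.com/cunningham-lab/igr). It contrasts a standard Gumbel-Softmax sample with a Gaussian reparameterization pushed onto the simplex through a softmax++-style invertible map; the function names, the temperature scaling, and the constant delta are assumptions made for illustration.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, rng):
    """Relaxed one-hot sample via the Gumbel-Softmax trick."""
    gumbels = -np.log(-np.log(rng.uniform(size=logits.shape)))
    scores = (logits + gumbels) / tau
    scores = scores - scores.max()            # numerical stability
    expd = np.exp(scores)
    return expd / expd.sum()

def igr_style_sample(mu, sigma, tau, rng, delta=1.0):
    """Gaussian reparameterization mapped into the interior of the simplex.

    Draws z ~ N(mu, diag(sigma^2)) over K-1 coordinates, then applies a
    softmax++-style map: coordinate K gets the leftover mass
    delta / (sum + delta), so the map from z to the first K-1
    probabilities stays invertible. Details here are illustrative.
    """
    z = mu + sigma * rng.standard_normal(size=mu.shape)  # reparameterized draw
    expd = np.exp(z / tau)
    denom = expd.sum() + delta
    return np.append(expd / denom, delta / denom)

rng = np.random.default_rng(0)
print(gumbel_softmax_sample(np.zeros(10), tau=0.5, rng=rng))      # 10 categories
print(igr_style_sample(np.zeros(9), np.ones(9), tau=0.5, rng=rng))  # 9 free coords + remainder
```

Because the latent noise is Gaussian, the KL term in a VAE objective reduces to a KL between Gaussians, which is what the quoted "closed form KL" advantage refers to.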
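The "Experiment Setup" and "Dataset Splits" rows describe the grid the authors report: 20 discrete latent variables with 10 categories each, Bernoulli decoders for binarized MNIST and Omniglot, a Gaussian decoder for FMNIST, 5 runs per setting, and temperature chosen by cross validation over the listed grid. The following is a hypothetical sketch of that grid; the `train_vae` callable and its keyword arguments are assumptions, with the real training loop in the authors' repository.

```python
import numpy as np

TEMPERATURES = [0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0]
NUM_DISCRETE_VARS = 20   # discrete latent variables per example
NUM_CATEGORIES = 10      # categories per discrete variable
NUM_RUNS = 5             # repetitions, reported as mean +/- one std. dev.

# Decoder likelihood per dataset, as described in the quoted setup.
DECODERS = {"mnist": "bernoulli", "omniglot": "bernoulli", "fmnist": "gaussian"}

def cross_validate_temperature(train_vae):
    """Pick the best-performing temperature per dataset by validation ELBO.

    `train_vae` is an assumed callable returning a scalar validation ELBO.
    """
    best = {}
    for dataset, decoder in DECODERS.items():
        scores = {}
        for tau in TEMPERATURES:
            elbos = [
                train_vae(dataset=dataset, decoder=decoder, temperature=tau,
                          n_vars=NUM_DISCRETE_VARS, n_cats=NUM_CATEGORIES,
                          seed=run)
                for run in range(NUM_RUNS)
            ]
            scores[tau] = (float(np.mean(elbos)), float(np.std(elbos)))
        # keep the temperature with the highest mean validation ELBO
        best[dataset] = max(scores.items(), key=lambda item: item[1][0])
    return best
```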