Invertible Gaussian Reparameterization: Revisiting the Gumbel-Softmax
Authors: Andres Potapczynski, Gabriel Loaiza-Ganem, John P. Cunningham
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our construction enjoys theoretical advantages over the Gumbel-Softmax, such as closed-form KL, and significantly outperforms it in a variety of experiments. *(A sketch of this closed-form KL appears below the table.)* |
| Researcher Affiliation | Collaboration | Andres Potapczynski (Zuckerman Institute, Columbia University, ap3635@columbia.edu); Gabriel Loaiza-Ganem (Layer 6 AI, gabriel@layer6.ai); John P. Cunningham (Department of Statistics, Columbia University, jpc2181@columbia.edu) |
| Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/cunningham-lab/igr. |
| Open Datasets | Yes | The datasets we use are handwritten digits from MNIST, fashion items from FMNIST and alphabet symbols from Omniglot. |
| Dataset Splits | Yes | We thus choose the temperature hyperparameter through cross validation, considering the range of possible temperatures {0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0} and compare best-performing models. *(See the temperature-sweep sketch below the table.)* |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU/GPU models or memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the experiments involving a KL term, we use variational autoencoders (VAEs) [14]. We trained VAEs composed of 20 discrete variables with 10 categories each. For MNIST and Omniglot we used a fixed binarization and a Bernoulli decoder, whereas for FMNIST we use a Gaussian decoder. We ran each experiment 5 times and report averages plus/minus one standard deviation. We thus choose the temperature hyperparameter through cross validation, considering the range of possible temperatures {0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0} and compare best-performing models. *(See the latent-layer and temperature-sweep sketches below the table.)* |
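
The "closed-form KL" advantage quoted in the Research Type row comes from the fact that the IGR is an invertible transformation of Gaussian noise, and KL divergence is invariant under invertible transformations, so the KL term of the ELBO reduces to a KL between Gaussians. Below is a minimal sketch of that closed-form diagonal-Gaussian KL against a standard normal prior; the function name and the 9-dimensional example are illustrative, not taken from the paper's code.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    No Monte Carlo sampling is needed, unlike the KL estimate
    typically paired with the Gumbel-Softmax relaxation.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Illustrative posterior for one latent variable with K = 10 categories,
# parameterized by a (K - 1)-dimensional Gaussian as in the IGR construction.
mu = np.random.randn(9)
log_var = 0.1 * np.random.randn(9)
print(kl_diag_gaussian_to_standard_normal(mu, log_var))
```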
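
The Experiment Setup row describes VAEs with 20 discrete latent variables of 10 categories each. The sketch below shows how such a latent layer is commonly wired using the Gumbel-Softmax baseline relaxation in PyTorch; the class name, input dimension, batch size, and temperature are illustrative assumptions, and the paper's IGR would replace the `gumbel_softmax` sampling step with its invertible Gaussian transformation.

```python
import torch
import torch.nn as nn

N_VARS, N_CATS, TAU = 20, 10, 0.67  # 20 discrete latents x 10 categories

class RelaxedCategoricalEncoder(nn.Module):
    """Hypothetical encoder head: one relaxed categorical per latent variable."""

    def __init__(self, in_dim=784):  # e.g., flattened 28x28 MNIST images
        super().__init__()
        self.logits = nn.Linear(in_dim, N_VARS * N_CATS)

    def forward(self, x):
        logits = self.logits(x).view(-1, N_VARS, N_CATS)
        # Gumbel-Softmax baseline: differentiable, approximately one-hot samples.
        return torch.nn.functional.gumbel_softmax(logits, tau=TAU, dim=-1)

x = torch.rand(32, 784)               # stand-in for a binarized image batch
z = RelaxedCategoricalEncoder()(x)    # shape: (32, 20, 10)
print(z.shape, z.sum(dim=-1)[0, :3])  # each relaxed sample sums to 1
```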
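
The temperature grid quoted in the Dataset Splits and Experiment Setup rows suggests a straightforward cross-validation sweep. Here is a minimal sketch under the assumption that `train_and_eval` stands in for training a VAE at a fixed temperature and returning its validation score (higher is better); it is not a function from the authors' codebase.

```python
# Temperature grid copied from the paper's quoted range.
TEMPERATURES = [0.01, 0.03, 0.07, 0.1, 0.25, 0.4, 0.5, 0.67, 0.85, 1.0]

def select_temperature(train_and_eval):
    """Return the best-performing temperature and all validation scores."""
    scores = {tau: train_and_eval(tau) for tau in TEMPERATURES}
    return max(scores, key=scores.get), scores

# Usage with a dummy objective peaking at tau = 0.4 (illustration only).
best_tau, scores = select_temperature(lambda tau: -abs(tau - 0.4))
print(best_tau)  # -> 0.4
```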