Categorical Reparameterization with Gumbel-Softmax
Authors: Eric Jang, Shixiang Gu, Ben Poole
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification. |
| Researcher Affiliation | Collaboration | Eric Jang (Google Brain, ejang@google.com); Shixiang Gu (University of Cambridge / MPI Tübingen, sg717@cam.ac.uk); Ben Poole (Stanford University, poole@cs.stanford.edu) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | No concrete access information (e.g., a specific repository link, explicit code release statement) for the source code of the methodology described in the paper was found. |
| Open Datasets | Yes | We use the MNIST dataset with fixed binarization for training and evaluation, which is common practice for evaluating stochastic gradient estimators (Salakhutdinov & Murray, 2008; Larochelle & Murray, 2011). |
| Dataset Splits | No | The paper mentions using a 'validation set' to select hyperparameters: 'we select the best learning rate for each estimator using the MNIST validation set, and report performance on the test set.' However, it does not provide specific details on the dataset splits (e.g., exact percentages or sample counts for the validation split) needed to reproduce the data partitioning. |
| Hardware Specification | Yes | Evaluations were performed on a GTX Titan X GPU. |
| Software Dependencies | No | The paper mentions techniques such as backpropagation, stochastic gradient descent with momentum 0.9, and sigmoid, softmax, and ReLU activations, but it does not name or version any programming languages, libraries, or frameworks, which would be needed for reproducible software dependency information. |
| Experiment Setup | Yes | Learning rates are chosen from {3e-5, 1e-5, 3e-4, 1e-4, 3e-3, 1e-3}; we select the best learning rate for each estimator using the MNIST validation set, and report performance on the test set. The temperature is annealed using the schedule τ = max(0.5, exp(−rt)) of the global training step t, where τ is updated every N steps. N ∈ {500, 1000} and r ∈ {1e-5, 1e-4} are hyperparameters for which we select the best-performing estimator on the validation set and report test performance. |
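
To make the quoted setup concrete, below is a minimal NumPy sketch of a Gumbel-Softmax sample together with the temperature schedule τ = max(0.5, exp(−rt)) from the Experiment Setup row. This is not the authors' code (none was released, per the Open Source Code row); the function names, the example logits, and the piecewise-constant reading of "τ is updated every N steps" are our assumptions.

```python
import numpy as np

def sample_gumbel_softmax(logits, tau, rng):
    """One Gumbel-Softmax sample: softmax((logits + Gumbel(0,1) noise) / tau)."""
    # Gumbel(0, 1) noise via inverse-CDF sampling: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = rng.uniform(low=1e-20, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    y = (logits + g) / tau
    # Numerically stable softmax over the category axis
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

def anneal_tau(step, r=1e-4, n=500, tau_min=0.5):
    """Schedule from the paper: tau = max(0.5, exp(-r * t)), recomputed every n steps.

    Holding tau constant between updates is our reading of "tau is updated
    every N steps"; r and n are taken from the hyperparameter grids quoted above.
    """
    t = (step // n) * n  # last step at which tau was updated
    return max(tau_min, float(np.exp(-r * t)))

rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.6, 0.3]))  # hypothetical categorical logits
for step in (0, 10_000, 100_000):
    tau = anneal_tau(step)
    print(f"step={step:>6}  tau={tau:.3f}  sample={sample_gumbel_softmax(logits, tau, rng)}")
```

With r = 1e-4, the schedule hits its floor of 0.5 by step 10,000 (exp(−1) ≈ 0.37 < 0.5); as τ anneals toward that floor, samples concentrate near one-hot vectors while gradients through the softmax remain defined, which is the trade-off the schedule manages.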