Categorical Reparameterization with Gumbel-Softmax

Authors: Eric Jang, Shixiang Gu, Ben Poole

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification. (A sketch of the estimator appears after this table.)
Researcher Affiliation | Collaboration | Eric Jang (Google Brain, ejang@google.com); Shixiang Gu (University of Cambridge and MPI Tübingen, sg717@cam.ac.uk); Ben Poole (Stanford University, poole@cs.stanford.edu)
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. (A reconstructed sketch of the straight-through variant appears after this table.)
Open Source Code | No | No concrete access information (e.g., a specific repository link or an explicit code-release statement) was found for the source code of the methodology described in the paper.
Open Datasets | Yes | We use the MNIST dataset with fixed binarization for training and evaluation, which is common practice for evaluating stochastic gradient estimators (Salakhutdinov & Murray, 2008; Larochelle & Murray, 2011). (A binarization sketch appears after this table.)
Dataset Splits | No | The paper mentions using a validation set to select hyperparameters ("we select the best learning rate for each estimator using the MNIST validation set, and report performance on the test set"), but it does not give the details needed to reproduce the partitioning, such as exact percentages or sample counts for the validation split.
Hardware Specification | Yes | Evaluations were performed on a GTX Titan X® GPU.
Software Dependencies | No | The paper mentions software components such as backpropagation, stochastic gradient descent with momentum 0.9, sigmoid activations, softmax activations, and ReLU activations, but it does not give version numbers for any programming language, library, or framework, which reproducible software-dependency information requires.
Experiment Setup | Yes | Learning rates are chosen from {3e-5, 1e-5, 3e-4, 1e-4, 3e-3, 1e-3}; we select the best learning rate for each estimator using the MNIST validation set, and report performance on the test set. The temperature is annealed using the schedule τ = max(0.5, exp(−rt)) of the global training step t, where τ is updated every N steps. N ∈ {500, 1000} and r ∈ {1e-5, 1e-4} are hyperparameters; the best-performing configuration on the validation set is selected, and test performance is reported. (A sketch of the annealing schedule appears after this table.)
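For readers reconstructing the method, here is a minimal sketch of a single Gumbel-Softmax sample, y = softmax((log π + g)/τ) with g_i ~ Gumbel(0, 1). This is our illustration of the equation, not code released by the authors; the function name and the use of NumPy are our choices.

```python
import numpy as np

def sample_gumbel_softmax(logits, tau, rng):
    """Draw one Gumbel-Softmax sample from unnormalized log-probabilities.

    g_i ~ Gumbel(0, 1) is obtained as -log(-log(u)) with u ~ Uniform(0, 1).
    """
    u = rng.uniform(low=1e-20, high=1.0, size=logits.shape)  # avoid log(0)
    g = -np.log(-np.log(u))                                  # Gumbel(0, 1) noise
    z = (logits + g) / tau                                   # temperature-scaled perturbed logits
    z = z - z.max()                                          # numerical stability
    e = np.exp(z)
    return e / e.sum()                                       # softmax: a point on the simplex

rng = np.random.default_rng(0)
y = sample_gumbel_softmax(np.log([0.2, 0.3, 0.5]), tau=0.5, rng=rng)
```

As τ → 0 the samples approach one-hot vectors drawn from the categorical distribution; as τ grows, they approach the uniform center of the simplex.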
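Because the paper contains no pseudocode, the following sketch of the Straight-Through (ST) Gumbel-Softmax variant it describes is our own reconstruction. It is written in PyTorch, which the paper does not use; the detach trick stands in for a generic stop_gradient.

```python
import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits, tau):
    """Straight-Through Gumbel-Softmax: a discrete one-hot sample on the
    forward pass, continuous Gumbel-Softmax gradients on the backward pass."""
    u = torch.rand_like(logits).clamp_min(1e-20)             # avoid log(0)
    g = -torch.log(-torch.log(u))                            # Gumbel(0, 1) noise
    y_soft = F.softmax((logits + g) / tau, dim=-1)           # relaxed sample
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)  # one-hot
    # Forward value is y_hard; gradients flow only through y_soft.
    return y_hard + y_soft - y_soft.detach()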
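The paper does not spell out its binarization beyond the citations above; one common reading of "fixed binarization" is sampling each pixel once as a Bernoulli draw and then reusing that binary dataset throughout. The sketch below follows that assumption, with a fixed seed of our choosing.

```python
import numpy as np

def binarize_fixed(images, seed=0):
    """Binarize grayscale images in [0, 1] once, with a fixed seed, so the
    same binary dataset is reused for training and evaluation.
    The Bernoulli-sampling interpretation and the seed are assumptions."""
    rng = np.random.default_rng(seed)
    return (rng.uniform(size=images.shape) < images).astype(np.float32)
```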
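The quoted annealing schedule is easy to misread, so here is a small sketch of τ = max(0.5, exp(−rt)) with τ re-evaluated only every N steps. We assume τ is held at its last evaluated value between updates; the function name is ours.

```python
import math

def temperature(step, r, N, tau_min=0.5):
    """tau = max(tau_min, exp(-r * t)) of the global training step t,
    re-evaluated every N steps and held constant in between (assumed)."""
    t = (step // N) * N            # last step at which tau was updated
    return max(tau_min, math.exp(-r * t))

# Example: with r = 1e-4, exp(-r * t) falls to the floor of 0.5 once
# t >= ln(2) / r, i.e. around step 6932.
```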