Gradient Estimation with Stochastic Softmax Tricks
Authors: Max Paulus, Dami Choi, Daniel Tarlow, Andreas Krause, Chris J. Maddison
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our goal in these experiments was to evaluate the use of SSTs for learning distributions over structured latent spaces in deep structured models. |
| Researcher Affiliation | Collaboration | Max B. Paulus (ETH Zürich, max.paulus@inf.ethz.ch); Dami Choi (University of Toronto, choidami@cs.toronto.edu); Daniel Tarlow (Google Research, Brain Team, dtarlow@google.com); Andreas Krause (ETH Zürich, krausea@ethz.ch); Chris J. Maddison (University of Toronto & DeepMind, cmaddis@cs.toronto.edu) |
| Pseudocode | No | The paper describes methods and algorithms textually but does not include structured pseudocode or algorithm blocks (an illustrative sketch of the core relaxation appears after this table). |
| Open Source Code | Yes | Code is available at https://github.com/choidami/sst. |
| Open Datasets | Yes | We used a simplified variant of the ListOps dataset [62], which contains sequences of prefix arithmetic expressions, e.g., max[ 3 min[ 8 2 ]], that evaluate to an integer in [0, 9]. (A toy evaluator for this format appears after the table.) |
| Dataset Splits | No | We selected models on a validation set according to the best objective value obtained during training. All reported values are measured on a test set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions TensorFlow, PyTorch, and JAX as examples of modern software packages, but it does not provide specific version numbers for these or any other ancillary software dependencies. |
| Experiment Setup | Yes | We optimized hyperparameters (including fixed training temperature t) using random search over multiple independent runs. We selected models on a validation set according to the best objective value obtained during training. |
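
Since the paper provides no pseudocode, the following is a minimal sketch of the Gumbel-softmax relaxation, the categorical (one-hot) special case of the stochastic softmax tricks the paper generalizes to structured spaces. It is written in PyTorch for illustration only; the temperature value and the toy downstream objective are assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Relaxed one-hot sample: the simplest stochastic softmax trick.

    Perturb the logits with Gumbel noise, then apply a tempered softmax.
    As temperature -> 0, samples approach exact one-hot categorical draws.
    """
    # Gumbel(0, 1) noise via inverse transform; eps guards against log(0).
    uniform = torch.rand_like(logits)
    gumbel = -torch.log(-torch.log(uniform + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / temperature, dim=-1)

# Usage: gradients flow through the relaxed sample back to the logits.
logits = torch.randn(4, requires_grad=True)
sample = gumbel_softmax_sample(logits, temperature=0.5)  # temperature is illustrative
loss = (sample * torch.arange(4.0)).sum()  # toy downstream objective
loss.backward()
print(logits.grad)
```

Because the relaxed sample is a differentiable function of the logits, the backward pass yields a reparameterization gradient, which is what makes the fixed training temperature a tunable hyperparameter as described in the experiment setup above.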
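
For concreteness about the ListOps format quoted above, here is a toy evaluator for such prefix expressions. The `eval_listops` helper is hypothetical and covers only the max/min operators from the quoted example; the actual dataset includes additional operators.

```python
def eval_listops(expr: str) -> int:
    """Evaluate a prefix ListOps expression such as 'max[ 3 min[ 8 2 ]]'."""
    ops = {"max": max, "min": min}  # the real dataset has more operators
    tokens = expr.replace("[", " [ ").replace("]", " ] ").split()
    stack = []  # holds pending (operator, argument-list) frames
    for tok in tokens:
        if tok in ops:
            stack.append((ops[tok], []))
        elif tok == "[":
            continue  # opening bracket immediately follows its operator
        elif tok == "]":
            op, args = stack.pop()
            value = op(args)
            if stack:
                stack[-1][1].append(value)
            else:
                return value  # outermost expression closed
        else:
            stack[-1][1].append(int(tok))
    raise ValueError("unbalanced expression")

# The example from the dataset description: min[8 2] = 2, so max[3 2] = 3.
assert eval_listops("max[ 3 min[ 8 2 ]]") == 3
```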