Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Gradient Estimation with Discrete Stein Operators
Authors: Jiaxin Shi, Yuhao Zhou, Jessica Hwang, Michalis Titsias, Lester Mackey
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate RODEO on 15 benchmark tasks, including training binary variational autoencoders (VAEs) with one or more stochastic layers. In most cases and with the same number of function evaluations, RODEO delivers lower variance and better training objectives than the state-of-the-art gradient estimators DisARM [14, 69], ARMS [13], Double CV [60], and RELAX [20]." and "Table 2: Training binary latent VAEs with K = 2, 3 (except for RELAX, which uses 3 evaluations) on MNIST, Fashion-MNIST, and Omniglot. We report the average ELBO (±1 standard error) on the training set after 1M steps over 5 independent runs." |
| Researcher Affiliation | Collaboration | Jiaxin Shi (Stanford University), Yuhao Zhou (Tsinghua University), Jessica Hwang (Stanford University), Michalis K. Titsias (DeepMind), Lester Mackey (Microsoft Research New England) |
| Pseudocode | Yes | Algorithm 1: Optimizing E_q[f(x)] with RODEO gradients |
| Open Source Code | Yes | Python code replicating all experiments can be found at https://github.com/thjashin/rodeo. |
| Open Datasets | Yes | We consider the MNIST [33], Fashion-MNIST [66] and Omniglot [32] datasets using their standard train, validation, and test splits. |
| Dataset Splits | Yes | We consider the MNIST [33], Fashion-MNIST [66] and Omniglot [32] datasets using their standard train, validation, and test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Python code' but does not provide specific version numbers for Python or any key software libraries and dependencies used in the experiments. |
| Experiment Setup | Yes | "The VAE architecture and training experimental setup follows Titsias and Shi [60], and details are given in Appendix D." and "The functions H (13) and H (14) share a neural network architecture with two output units and a single hidden layer with 100 units." and "We report the average ELBO (±1 standard error) on the training set after 1M steps over 5 independent runs." |