Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Gradient Estimation with Discrete Stein Operators
Authors: Jiaxin Shi, Yuhao Zhou, Jessica Hwang, Michalis Titsias, Lester Mackey
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate RODEO on 15 benchmark tasks, including training binary variational autoencoders (VAEs) with one or more stochastic layers. In most cases and with the same number of function evaluations, RODEO delivers lower variance and better training objectives than the state-of-the-art gradient estimators DisARM [14, 69], ARMS [13], Double CV [60], and RELAX [20]." and "Table 2: Training binary latent VAEs with K = 2, 3 (except for RELAX, which uses 3 evaluations) on MNIST, Fashion-MNIST, and Omniglot. We report the average ELBO (±1 standard error) on the training set after 1M steps over 5 independent runs." |
| Researcher Affiliation | Collaboration | Jiaxin Shi (Stanford University), Yuhao Zhou (Tsinghua University), Jessica Hwang (Stanford University), Michalis K. Titsias (DeepMind), Lester Mackey (Microsoft Research New England) |
| Pseudocode | Yes | Algorithm 1: Optimizing E_q[f(x)] with RODEO gradients |
| Open Source Code | Yes | Python code replicating all experiments can be found at https://github.com/thjashin/rodeo. |
| Open Datasets | Yes | We consider the MNIST [33], Fashion-MNIST [66] and Omniglot [32] datasets using their standard train, validation, and test splits. |
| Dataset Splits | Yes | We consider the MNIST [33], Fashion-MNIST [66] and Omniglot [32] datasets using their standard train, validation, and test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Python code' but does not provide specific version numbers for Python or any key software libraries and dependencies used in the experiments. |
| Experiment Setup | Yes | "The VAE architecture and training experimental setup follows Titsias and Shi [60], and details are given in Appendix D." and "The functions H (13) and H (14) share a neural network architecture with two output units and a single hidden layer with 100 units." and "We report the average ELBO (±1 standard error) on the training set after 1M steps over 5 independent runs." |