Evaluating the Variance of Likelihood-Ratio Gradient Estimators

Authors: Seiya Tokui, Issei Sato

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to empirically verify Theorem 1 and to demonstrate a procedure to analyze the optimal degree of a given estimator covered by our framework. We use MNIST (LeCun et al., 1998) and Omniglot (Lake et al., 2015) for our experiments.
Researcher Affiliation | Collaboration | Preferred Networks, Tokyo, Japan; The University of Tokyo, Tokyo, Japan; RIKEN, Tokyo, Japan.
Pseudocode | Yes | Algorithm 1: Algorithm for RAM estimator (4) for discrete z_i. If z_i is continuous, the loop over all the configurations of z_i is replaced by a loop over integration points. (A hedged sketch of this loop structure is given after the table.)
Open Source Code | No | The paper states 'All the methods are implemented with Chainer (Tokui et al., 2015)' but does not provide a link to the authors' own source code or an explicit statement that it is released.
Open Datasets | Yes | We use MNIST (LeCun et al., 1998) and Omniglot (Lake et al., 2015) for our experiments.
Dataset Splits | Yes | For the MNIST dataset, we use the standard split of 60,000 training images and 10,000 test images. The training images are further split into 50,000 images and 10,000 images, the latter of which are used for validation. For the Omniglot dataset, we use the standard split of 24,345 training images and 8,070 test images used in the official implementation of Burda et al. (2015). The training images are further split into 20,288 images and 4,057 images, the latter of which are used for validation. (A minimal split sketch is given after the table.)
Hardware Specification | Yes | Each experiment is done on an Intel(R) Xeon(R) CPU E5-2623 v3 at 3.00 GHz and an NVIDIA GeForce Titan X.
Software Dependencies | No | The paper states 'All the methods are implemented with Chainer (Tokui et al., 2015)' but does not provide specific version numbers for Chainer or any other software dependency.
Experiment Setup | Yes | We used RMSprop (Tieleman & Hinton, 2012) with a minibatch size of 100 to optimize the variational lower bound. We apply a weight decay of the coefficient 0.001 for all parameters. All the weights are initialized with the method of Glorot & Bengio (2010). The learning rate is chosen from {3 × 10^-4, 10^-3, 3 × 10^-3}. (A Chainer-style configuration sketch is given after the table.)
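
The Pseudocode row refers to Algorithm 1, which loops over all configurations of each discrete unit z_i (or over integration points if z_i is continuous). The sketch below only illustrates that per-unit marginalization loop under stated assumptions: binary units, a sampled configuration z, and hypothetical helpers q_prob and objective standing in for the variational posterior and the estimand. It is not the authors' estimator (4).

```python
import numpy as np

def marginalize_each_unit(z, q_prob, objective, num_values=2):
    """Illustrative per-unit marginalization loop (not the paper's estimator (4)).

    z            : sampled configuration of discrete latents, shape (n_units,)
    q_prob(i, v) : hypothetical helper, probability that unit i takes value v
    objective(z) : hypothetical helper, value of the estimand for a fixed configuration
    """
    per_unit = np.zeros(len(z))
    for i in range(len(z)):
        # Loop over all configurations of z_i; for continuous z_i this loop
        # would instead run over integration points, as the algorithm caption notes.
        for v in range(num_values):
            z_v = z.copy()
            z_v[i] = v
            per_unit[i] += q_prob(i, v) * objective(z_v)
    return per_unit
```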
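
For the Dataset Splits row, the 50,000/10,000 train/validation split of the MNIST training set could look like the following. The paper states only the resulting sizes; whether the split is random or sequential, and any seed, is not reported, so the shuffled split and the seed below are assumptions.

```python
import numpy as np

def make_validation_split(x_train, n_valid=10_000, seed=0):
    """Split the 60,000 MNIST training images into 50,000 train / 10,000 validation.

    The shuffling and the seed are assumptions for illustration; the paper only
    gives the resulting sizes (50,000/10,000 for MNIST, 20,288/4,057 for Omniglot).
    """
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(x_train))
    return x_train[idx[n_valid:]], x_train[idx[:n_valid]]
```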
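
The Experiment Setup row mentions RMSprop, a minibatch size of 100, weight decay 0.001, and Glorot initialization, all implemented with Chainer. Below is a minimal Chainer-style sketch of how such a configuration might be wired up; the two-layer network, its sizes and activation, and the particular learning rate are placeholders, not the paper's model.

```python
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import initializers, optimizers

class PlaceholderModel(chainer.Chain):
    """Stand-in two-layer network; the paper's variational model is not reproduced here."""
    def __init__(self):
        super().__init__()
        init = initializers.GlorotUniform()  # Glorot & Bengio (2010) initialization
        with self.init_scope():
            self.l1 = L.Linear(784, 200, initialW=init)
            self.l2 = L.Linear(200, 200, initialW=init)

    def __call__(self, x):
        return self.l2(F.tanh(self.l1(x)))

model = PlaceholderModel()
optimizer = optimizers.RMSprop(lr=1e-3)  # learning rate chosen from {3e-4, 1e-3, 3e-3}
optimizer.setup(model)
# Weight decay applied to all parameters (chainer.optimizer_hooks.WeightDecay in newer Chainer).
optimizer.add_hook(chainer.optimizer.WeightDecay(0.001))
batch_size = 100  # minibatch size from the paper
```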