ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks

Authors: Mingzhang Yin, Mingyuan Zhou

ICLR 2019

Reproducibility variables, results, and LLM responses:
Research Type: Experimental
LLM response: "Experimental results show the ARM estimator provides state-of-the-art performance in auto-encoding variational inference and maximum likelihood estimation, for discrete latent variable models with one or multiple stochastic binary layers."
Researcher Affiliation: Academia
LLM response: "Mingzhang Yin, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, mzyin@utexas.edu; Mingyuan Zhou, Department of IROM, McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, mingyuan.zhou@mccombs.utexas.edu"
Pseudocode: Yes
LLM response: "Algorithm 1: ARM gradient for a V-dimensional binary latent vector; Algorithm 2: ARM gradient for a T-stochastic-hidden-layer binary network"
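As a minimal NumPy sketch of the single-layer estimator that Algorithm 1 describes (our rendering, not the authors' code; the function names and the toy check are ours): for z ~ Bernoulli(sigmoid(phi)), ARM estimates the gradient from the objective evaluated at two antithetic binary configurations, scaled by (u - 1/2), with no gradient of f required.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def arm_gradient(f, phi):
    """Single-sample ARM estimate of d/dphi E_{z ~ Bernoulli(sigmoid(phi))}[f(z)].

    f   : maps a binary vector in {0,1}^V to a scalar objective value.
    phi : length-V vector of Bernoulli logits.
    """
    u = rng.uniform(size=phi.shape)              # u ~ Uniform(0,1)^V
    z_swap = (u > sigmoid(-phi)).astype(float)   # 1[u > sigma(-phi)]
    z_true = (u < sigmoid(phi)).astype(float)    # 1[u < sigma(phi)]
    return (f(z_swap) - f(z_true)) * (u - 0.5)   # unbiased single-sample estimate

# Toy check: for f(z) = sum(z), the exact gradient is sigmoid(phi) * (1 - sigmoid(phi)).
phi = np.array([0.3, -1.2, 2.0])
est = np.mean([arm_gradient(np.sum, phi) for _ in range(100_000)], axis=0)
print(est)   # should approach sigmoid(phi) * (1 - sigmoid(phi))
```

Algorithm 2 chains this estimator through T stochastic hidden layers; it is not sketched here.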
Open Source Code: Yes
LLM response: "Python code for reproducible research is available at https://github.com/mingzhang-yin/ARM-gradient."
Open Datasets: Yes
LLM response: "We consider a widely used binarization (Salakhutdinov & Murray, 2008; Larochelle & Murray, 2011; Yin & Zhou, 2018), referred to as MNIST-static and available at http://www.dmi.usherb.ca/~larocheh/mlpython/_modules/datasets/binarized_mnist.html. In addition to MNIST-static, we also consider MNIST-threshold (van den Oord et al., 2017), which binarizes MNIST by thresholding each pixel value at 0.5, and the binarized OMNIGLOT dataset."
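The MNIST-threshold rule quoted above reduces to a one-line deterministic binarization; a sketch (the function name is ours, and the handling of a pixel exactly at 0.5 is our assumption):

```python
import numpy as np

def binarize_threshold(x, threshold=0.5):
    """Deterministically binarize pixel intensities in [0, 1] at a threshold.
    Tie-handling at exactly 0.5 is an assumption; the excerpt does not specify it."""
    return (x > threshold).astype(np.float32)
```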
Dataset Splits: Yes
LLM response: "For each dataset, using its default training/validation/testing partition, we train all methods on the training set, calculate the validation log-likelihood for every epoch, and report the test negative log-likelihood when the validation negative log-likelihood reaches its minimum within a predefined maximum number of iterations."
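The quoted protocol is early model selection on validation negative log-likelihood (NLL); a schematic sketch with placeholder callables (train_epoch, val_nll, and test_nll are illustrative names, not from the paper or repository):

```python
def select_by_validation(train_epoch, val_nll, test_nll, max_epochs):
    """Report the test NLL recorded at the epoch of minimum validation NLL."""
    best_val, reported_test = float("inf"), None
    for _ in range(max_epochs):
        train_epoch()                   # one pass over the training split
        current = val_nll()             # validation NLL after this epoch
        if current < best_val:          # new validation minimum ...
            best_val = current
            reported_test = test_nll()  # ... so snapshot the test NLL
    return reported_test
```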
Hardware Specification: Yes
LLM response: "The authors acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research, and the computational support of Texas Advanced Computing Center."
Software Dependencies: No
LLM response: "The paper mentions using 'Adam' but does not provide specific version numbers for any software components or libraries."
Experiment Setup: Yes
LLM response: "We maximize a single-Monte-Carlo-sample ELBO using Adam (Kingma & Ba, 2014), with the learning rate selected from {5, 1, 0.5} × 10^-4 by the validation set. We set the batch size as 50 for MNIST and 25 for OMNIGLOT."
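For reference, the quoted settings collected into a single configuration sketch (the dict layout and key names are ours; only the values come from the excerpt):

```python
config = {
    "optimizer": "Adam",                         # Kingma & Ba (2014)
    "elbo_mc_samples": 1,                        # single-Monte-Carlo-sample ELBO
    "learning_rate_grid": [5e-4, 1e-4, 0.5e-4],  # {5, 1, 0.5} x 10^-4, chosen on validation
    "batch_size": {"MNIST": 50, "OMNIGLOT": 25},
}
```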