Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks

Authors: Adeel Pervez, Taco Cohen, Efstratios Gavves

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that FouST performs favorably compared to state-of-the-art biased estimators and is much faster than unbiased ones. (Section 6, Experiments) Experimental Setup: We first validate FouST on a toy setup with a known analytic expression for f(z). Next, we validate FouST by training generative models using the variational autoencoder framework of Kingma & Welling (2014). We optimize the single-sample variational lower bound (ELBO) of the log-likelihood. We train variational autoencoders exclusively with Boolean latent variables on OMNIGLOT, CIFAR10, mini-ImageNet (Vinyals et al., 2016) and MNIST (see the appendix).
Researcher Affiliation | Collaboration | (1) QUVA Lab, Informatics Institute, University of Amsterdam, The Netherlands; (2) Qualcomm AI Research, Qualcomm Technologies Netherlands B.V., The Netherlands. Correspondence to: Adeel Pervez <a.a.pervez@uva.nl>.
Pseudocode | Yes | We summarize FouST in Algorithm 1 (FouST Gradient Estimator). A minimal, hedged sketch of the baseline setup this algorithm refines is given after the table.
Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code, nor does it provide a link to a code repository.
Open Datasets | Yes | We train variational autoencoders exclusively with Boolean latent variables on OMNIGLOT, CIFAR10, mini-ImageNet (Vinyals et al., 2016) and MNIST (see the appendix).
Dataset Splits | No | The paper mentions 'Validation bits per dimension for various models on CIFAR-10', discusses 'Training ELBO curves', and reports 'test ELBO' values for MNIST. However, it does not give explicit split percentages, sample counts, or citations to predefined train/validation/test splits for the datasets used.
Hardware Specification | No | The paper mentions 'Training a nonlinear sigmoid belief network model on GPU with two stochastic layers on MNIST with REBAR took 1.5 days.' However, it only states 'on GPU' without providing specifics such as the GPU model, CPU type, or memory.
Software Dependencies | No | The paper mentions using a variational autoencoder framework and cites Kingma & Welling (2014), but it does not specify any software dependencies with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x).
Experiment Setup | No | The paper states 'All hyperparameters remain fixed throughout the training.' and 'All estimators in this section use one sample per example and a single decoder evaluation.' It also notes 'Details regarding architectures and hyperparameters are in the appendix.', implying they are not given in the main text.
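
The Research Type and Pseudocode rows refer to training Boolean-latent variational autoencoders on the single-sample ELBO with the FouST gradient estimator (the paper's Algorithm 1). The sketch below is not that algorithm; it is a minimal illustration, under assumed details, of the baseline it builds on: a Bernoulli-latent VAE trained on the single-sample ELBO with a plain straight-through estimator. PyTorch, the layer sizes, and the BooleanVAE class name are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumptions: PyTorch, illustrative layer sizes, binarized inputs in [0, 1]).
# A Bernoulli-latent VAE trained on the single-sample ELBO with a plain
# straight-through (ST) estimator -- the baseline that the paper's FouST
# estimator (Algorithm 1) refines; FouST's own bias/variance-reduction steps
# are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BooleanVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=200):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 512), nn.ReLU(), nn.Linear(512, z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim, 512), nn.ReLU(), nn.Linear(512, x_dim))
        self.prior_logit = nn.Parameter(torch.zeros(z_dim))  # learnable Bernoulli prior p(z)

    def forward(self, x):
        logits = self.encoder(x)            # logits of q(z|x)
        probs = torch.sigmoid(logits)
        z_hard = torch.bernoulli(probs)     # one Boolean sample per example
        # Straight-through: forward pass uses the hard sample,
        # backward pass routes gradients through the probabilities.
        z = z_hard + probs - probs.detach()
        x_logits = self.decoder(z)

        # Single-sample ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)), per example.
        log_px_z = -F.binary_cross_entropy_with_logits(x_logits, x, reduction='none').sum(-1)
        kl = (probs * (F.logsigmoid(logits) - F.logsigmoid(self.prior_logit))
              + (1 - probs) * (F.logsigmoid(-logits) - F.logsigmoid(-self.prior_logit))).sum(-1)
        return (log_px_z - kl).mean()       # maximize this (negate it to use as a loss)

# Example usage (illustrative):
# model = BooleanVAE()
# loss = -model(x_batch)   # x_batch: (batch, 784) tensor with values in [0, 1]
# loss.backward()
```

Maximizing the returned ELBO with a standard optimizer gives the baseline straight-through training loop; the paper's Algorithm 1 applies additional bias- and variance-reduction steps on top of this estimator.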