Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks
Authors: Adeel Pervez, Taco Cohen, Efstratios Gavves
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments show that FouST performs favorably compared to state-of-the-art biased estimators and is much faster than unbiased ones." From Section 6 (Experiments, Experimental Setup): "We first validate FouST on a toy setup with a known analytic expression for f(z). Next, we validate FouST by training generative models using the variational autoencoder framework of Kingma & Welling (2014). We optimize the single-sample variational lower bound (ELBO) of the log-likelihood. We train variational autoencoders exclusively with Boolean latent variables on OMNIGLOT, CIFAR10, mini-ImageNet (Vinyals et al., 2016) and MNIST (see the appendix)." (A minimal illustrative training sketch follows the table.) |
| Researcher Affiliation | Collaboration | 1QUVA Lab, Informatics Institute, University of Amsterdam, The Netherlands 2Qualcomm AI Research, Qualcomm Technologies Netherlands B.V., The Netherlands. Correspondence to: Adeel Pervez <a.a.pervez@uva.nl>. |
| Pseudocode | Yes | "We summarize FouST in Algorithm 1." (Algorithm 1: FouST Gradient Estimator) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code or provide a link to a code repository. |
| Open Datasets | Yes | "We train variational autoencoders exclusively with Boolean latent variables on OMNIGLOT, CIFAR10, mini-ImageNet (Vinyals et al., 2016) and MNIST (see the appendix)." |
| Dataset Splits | No | The paper mentions 'Validation bits per dimension for various models on CIFAR-10' and discusses 'Training ELBO curves'. It also refers to 'test ELBO' values for MNIST. However, it does not explicitly provide specific percentages, sample counts, or citations to predefined splits for training, validation, or test sets across all datasets used. |
| Hardware Specification | No | The paper mentions 'Training a nonlinear sigmoid belief network model on GPU with two stochastic layers on MNIST with REBAR took 1.5 days.' However, it only states 'on GPU' without providing specific details such as the GPU model, CPU type, or memory specifications. |
| Software Dependencies | No | The paper mentions using a 'variational autoencoder framework' and references 'Kingma & Welling (2014)', but it does not specify any software dependencies with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x). |
| Experiment Setup | No | The paper states 'All hyperparameters remain fixed throughout the training.' and 'All estimators in this section use one sample per example and a single decoder evaluation.' It also notes 'Details regarding architectures and hyperparameters are in the appendix.', implying they are not present in the main text. |
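
For readers attempting a basic reproduction, the following is a minimal sketch, assuming a PyTorch-style implementation, of the setup quoted above: a variational autoencoder with Boolean (Bernoulli) latent variables trained on the single-sample ELBO. It is illustrative only and is not the authors' method; it uses a plain straight-through surrogate gradient rather than FouST's Fourier-based bias and variance reductions, and the layer sizes, learning rate, and random batch are placeholder assumptions.

```python
# Minimal sketch (not the authors' code): Boolean-latent VAE trained on the
# single-sample ELBO with a plain straight-through (ST) surrogate gradient.
# FouST's bias/variance reductions are NOT implemented here; architecture
# sizes, learning rate, and the data batch are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BooleanVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=200):
        super().__init__()
        self.enc = nn.Linear(x_dim, z_dim)                 # q(z|x) logits
        self.dec = nn.Linear(z_dim, x_dim)                 # p(x|z) logits
        self.prior_logit = nn.Parameter(torch.zeros(z_dim))  # learnable p(z) logits

    def sample_st(self, logits):
        # Bernoulli sample with a straight-through surrogate gradient:
        # the forward pass uses the hard sample, the backward pass uses
        # d(sigmoid)/d(logits) via the `probs` term.
        probs = torch.sigmoid(logits)
        z_hard = torch.bernoulli(probs).detach()
        return z_hard + probs - probs.detach()

    def elbo(self, x):
        # Single-sample ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)).
        logits_q = self.enc(x)
        z = self.sample_st(logits_q)          # one sample per example
        x_logits = self.dec(z)                # one decoder evaluation
        log_px_z = -F.binary_cross_entropy_with_logits(
            x_logits, x, reduction='none').sum(-1)
        q = torch.sigmoid(logits_q)
        p = torch.sigmoid(self.prior_logit)
        kl = (q * (torch.log(q + 1e-8) - torch.log(p + 1e-8))
              + (1 - q) * (torch.log(1 - q + 1e-8) - torch.log(1 - p + 1e-8))).sum(-1)
        return (log_px_z - kl).mean()

model = BooleanVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.bernoulli(torch.rand(64, 784))  # stand-in for a binarized MNIST batch
loss = -model.elbo(x)                     # maximize ELBO = minimize its negative
opt.zero_grad()
loss.backward()
opt.step()
```

A faithful reproduction would replace the random batch with the binarized MNIST/OMNIGLOT loaders described in the paper's appendix and substitute the FouST gradient computation of Algorithm 1 for the plain straight-through sample used here.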