GO Gradient for Expectation-Based Objectives
Authors: Yulai Cong, Miaoyun Zhao, Ke Bai, Lawrence Carin
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We examine the proposed GO gradients and statistical back-propagation with four experiments: (i) simple one-dimensional (gamma and negative binomial) examples are presented to verify the GO gradient in Theorem 1, corresponding to nonnegative and discrete random variables; (ii) the discrete variational autoencoder experiment from Tucker et al. (2017) and Grathwohl et al. (2017) is reproduced to compare GO with state-of-the-art variance-reduction methods; (iii) a multinomial GAN, generating discrete observations, is constructed to demonstrate the deep GO gradient in Theorem 2; (iv) hierarchical variational inference (HVI) for two deep non-conjugate Bayesian models is developed to verify statistical back-propagation in Theorem 3. (See the first sketch below the table for an illustration of the one-dimensional gamma check.) |
| Researcher Affiliation | Academia | Yulai Cong, Miaoyun Zhao, Ke Bai, Lawrence Carin; Department of Electrical and Computer Engineering, Duke University |
| Pseudocode | Yes | Algorithm 1: an algorithm for (34), given as an example to demonstrate how to practically combine GO gradients with deep learning frameworks such as TensorFlow or PyTorch. (See the second sketch below the table.) |
| Open Source Code | Yes | Code for all experiments can be found at github.com/YulaiCong/GOgradient. |
| Open Datasets | Yes | Table 1 reports results per dataset (columns: Dataset, Model, Training, Validation) for MNIST... and Omniglot... |
| Dataset Splits | Yes | Table 1: Best obtained ELBOs for discrete variational autoencoders. Results of REBAR and RELAX are obtained by running the released code from Grathwohl et al. (2017). All methods are run with the same learning rate for 1,000,000 iterations. Columns: Dataset, Model, Training, Validation... ELBOs are calculated using all training/validation data. |
| Hardware Specification | Yes | Experiments presented below were implemented in TensorFlow or PyTorch with a Titan Xp GPU. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al.)' and 'PyTorch (Paszke et al., 2017)' as the frameworks used, but it does not specify exact version numbers for these or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | Stochastic gradient ascent with one-sample-estimated gradients is used to optimize the objective... All methods are run with the same learning rate for 1,000,000 iterations... The mini-batch size is set to 200. One-sample gradient estimates are used to train the model for the compared methods. For RSVI (Naesseth et al., 2016), the shape augmentation parameter B is set to 5... For numerical stability, let z(l) ≥ Tz, where Tz = 1e-5 is used in the experiments; let c(l) ≥ Tc, where Tc = 1e-5; let Φ(l+1)z(l+1) ≥ Tα with Tα = 0.2; use a factor to compromise the likelihood and prior for each z(l). (See the third sketch below the table.) |
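
Below is a minimal sketch of what the one-dimensional gamma check referenced in the Research Type row could look like. It assumes the continuous-variable form of the GO gradient, ∇α E_q[f(z)] = E_q[f'(z) · (−∂F(z; α)/∂α) / q(z; α)], uses the hypothetical test function f(z) = z² with the rate fixed to 1 (so the exact gradient is 2α + 1), and approximates the CDF derivative by finite differences. It is an illustration, not the authors' released code.

```python
# Hedged sketch: Monte Carlo check of the continuous GO gradient on a 1-D
# gamma example (not the authors' code). f(z) = z**2 and rate = 1 are
# illustrative choices; the exact gradient is then 2*alpha + 1.
import numpy as np
from scipy.stats import gamma

alpha, eps, n = 2.0, 1e-5, 200_000           # shape, finite-difference step, sample size
rng = np.random.default_rng(0)
z = rng.gamma(alpha, 1.0, size=n)            # draws from Gamma(alpha, rate=1)

# g(z) = -dF(z; alpha)/dalpha / q(z; alpha); CDF derivative via central differences.
dF_dalpha = (gamma.cdf(z, alpha + eps) - gamma.cdf(z, alpha - eps)) / (2 * eps)
g = -dF_dalpha / gamma.pdf(z, alpha)

go_grad = np.mean(g * 2 * z)                 # E[g(z) * f'(z)] with f'(z) = 2z
print(f"GO estimate: {go_grad:.3f}   exact: {2 * alpha + 1:.3f}")
```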
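Algorithm 1 itself is not reproduced here; the following is a hedged sketch of the general pattern it points to, namely drawing the sample outside the autodiff graph and reattaching a GO-style pathwise derivative through a first-order surrogate so that a framework such as PyTorch back-propagates the desired gradient. The gamma toy model, f(z) = z², and the finite-difference CDF derivative are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of wiring a GO-style gradient into an autodiff framework
# (PyTorch here); the surrogate below has forward value z and gradient dz/dalpha.
import torch

alpha = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)

def go_gamma_sample(alpha, eps=1e-4):
    with torch.no_grad():
        z = torch.distributions.Gamma(alpha, 1.0).sample()
        cdf = lambda a: torch.special.gammainc(a, z)          # CDF of Gamma(a, rate=1)
        dF = (cdf(alpha + eps) - cdf(alpha - eps)) / (2 * eps)
        q = torch.distributions.Gamma(alpha, 1.0).log_prob(z).exp()
        dz_dalpha = -dF / q                                   # GO pathwise derivative
    # Surrogate: value equals z; the backward pass sees dz/dalpha w.r.t. alpha.
    return z + dz_dalpha * (alpha - alpha.detach())

loss = -(go_gamma_sample(alpha) ** 2)     # toy objective: maximize E[z^2]
loss.backward()
print(alpha.grad)                         # noisy one-sample estimate of -(2*alpha + 1)
```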
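The thresholds quoted in the Experiment Setup row read as numerical-stability floors; under that reading (an assumption, since the inequality symbols were lost in extraction), they amount to simple clamps such as:

```python
# Hedged sketch of the numerical-stability floors quoted above, assuming the
# thresholds are lower bounds applied before evaluating log-probabilities.
import torch

T_z, T_c, T_alpha = 1e-5, 1e-5, 0.2

def stabilize(z, c, Phi_z):
    z = torch.clamp(z, min=T_z)              # z^(l)             >= T_z
    c = torch.clamp(c, min=T_c)              # c^(l)             >= T_c
    Phi_z = torch.clamp(Phi_z, min=T_alpha)  # Phi^(l+1) z^(l+1) >= T_alpha
    return z, c, Phi_z
```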