A Spectral Approach to Gradient Estimation for Implicit Distributions
Authors: Jiaxin Shi, Shengyang Sun, Jun Zhu
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on both toy problems and real-world examples. The latter includes applications of SSGE to two widely used inference methods: Hamiltonian Monte Carlo and variational inference. ... In Figure 1 we plot the gradient estimates produced by the Stein gradient estimator, its out-of-sample extension (Stein+) (see Section 2.2), and our approach (SSGE). ... The average acceptance ratios over 10 runs are plotted in Figure 2b. We can see that SSGE clearly outperforms Stein+ and is even better than the KMC algorithm... |
| Researcher Affiliation | Academia | ¹Dept. of Comp. Sci. & Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., THBI Lab, Tsinghua University; ²Dept. of Comp. Sci., University of Toronto. Correspondence to: Jiaxin Shi <shijx15@mails.tsinghua.edu.cn>, Jun Zhu <dcszj@tsinghua.edu.cn>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/thjashin/spectral-stein-grad. |
| Open Datasets | Yes | We follow the settings in Sejdinovic et al. (2014); Strathmann et al. (2015) and consider a Gaussian Process classification problem on the UCI Glass dataset. ... we adopt the settings in Shi et al. (2018) and train a deep convolutional VAE with implicit variational posteriors (Implicit VAE for short) on the CelebA dataset. ... Besides the CelebA experiments, we also tested the models on the MNIST dataset and evaluated the test log likelihoods. |
| Dataset Splits | No | The paper does not explicitly state specific dataset split information (e.g., percentages or sample counts) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory, or machine configuration) used to run its experiments. |
| Software Dependencies | No | The paper mentions 'Implementations are based on ZhuSuan (Shi et al., 2017)' but does not provide version numbers for software dependencies such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | For the regularization coefficient η in eq. (10), we searched it in {0.001, 0.01, 0.1, 1, 10, 100} and plotted the best result at η = 0.1. For SSGE, we set J = 6. ... For SSGE, we set J = 100. ... All other methods are trained with 100 samples for 20k iterations using the Adam optimizer (Kingma & Ba, 2014). For SSGE, we set M = 100, and r = 0.99. ... we randomly use between 1 and 10 leapfrog steps of size chosen uniformly in [0.01, 0.1], and a standard Gaussian momentum. |
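
The Experiment Setup row fixes SSGE's key knobs: the number of retained eigenfunctions J, the sample size M, the regularization coefficient η, and the eigenvalue-ratio threshold r. To make those settings concrete, below is a minimal NumPy sketch of the spectral Stein gradient estimator they parameterize. This is our reading of the method, not the authors' released code (see the repository linked above): the function names are ours, the bandwidth uses an assumed median-distance heuristic, and J is fixed directly rather than selected via the paper's ratio threshold r or regularized by η.

```python
import numpy as np

def rbf_kernel(x, y, sigma):
    """Gram matrix of k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def ssge_gradients(samples, query, J=6, sigma=None):
    """Estimate grad log q(x) at `query` from `samples` drawn from q.

    Sketch of the spectral Stein gradient estimator: eigendecompose the
    Gram matrix, extend the top-J eigenvectors to eigenfunctions via the
    Nystrom method, and combine them with Stein-derived coefficients.
    """
    M, D = samples.shape
    if sigma is None:
        # Median pairwise-distance bandwidth (a common heuristic, assumed here).
        sq_dists = np.sum((samples[:, None] - samples[None]) ** 2, axis=-1)
        sigma = np.sqrt(np.median(sq_dists[sq_dists > 0.0]))

    K = rbf_kernel(samples, samples, sigma)   # (M, M) Gram matrix
    eigvals, eigvecs = np.linalg.eigh(K)      # eigh returns ascending order
    lam = eigvals[::-1][:J]                   # top-J eigenvalues
    u = eigvecs[:, ::-1][:, :J]               # matching eigenvectors, (M, J)

    def psi(x):
        # Nystrom eigenfunctions: psi_j(x) = sqrt(M)/lam_j * sum_m u_mj k(x, x_m)
        return np.sqrt(M) / lam * (rbf_kernel(x, samples, sigma) @ u)

    def grad_psi(x):
        # d/dx k(x, x_m) = k(x, x_m) * (x_m - x) / sigma^2 for the RBF kernel
        k = rbf_kernel(x, samples, sigma)                            # (N, M)
        grad_k = k[..., None] * (samples[None] - x[:, None]) / sigma ** 2
        return np.sqrt(M) / lam[None, :, None] * np.einsum(
            'nmd,mj->njd', grad_k, u)                                # (N, J, D)

    # Stein-derived coefficients: beta_j = -(1/M) * sum_m grad psi_j(x_m)
    beta = -grad_psi(samples).mean(axis=0)                           # (J, D)
    return psi(query) @ beta                                         # (N, D)

# Sanity check on a standard Gaussian, whose true score is -x.
rng = np.random.default_rng(0)
xs = rng.normal(size=(100, 2))
print(ssge_gradients(xs, xs[:3], J=6))  # should be close to -xs[:3]
```

With M = 100 samples, as in the quoted setup, the estimate on the toy Gaussian should roughly track the true score −x; the quoted HMC experiment then plugs such estimated gradients into leapfrog updates with 1 to 10 steps of size drawn uniformly from [0.01, 0.1].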