A Spectral Approach to Gradient Estimation for Implicit Distributions

Authors: Jiaxin Shi, Shengyang Sun, Jun Zhu

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed approach on both toy problems and real-world examples. The latter includes applications of SSGE to two widely used inference methods: Hamiltonian Monte Carlo and variational inference. ... In Figure 1 we plot the gradient estimates produced by the Stein gradient estimator, its out-of-sample extension (Stein+) (see Section 2.2), and our approach (SSGE). ... The average acceptance ratios over 10 runs are plotted in Figure 2b. We can see that SSGE clearly outperforms Stein+ and is even better than the KMC algorithm...
Researcher Affiliation | Academia | 1) Dept. of Comp. Sci. & Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., THBI Lab, Tsinghua University; 2) Dept. of Comp. Sci., University of Toronto. Correspondence to: Jiaxin Shi <shijx15@mails.tsinghua.edu.cn>, Jun Zhu <dcszj@tsinghua.edu.cn>.
Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/thjashin/spectral-stein-grad.
Open Datasets | Yes | We follow the settings in Sejdinovic et al. (2014); Strathmann et al. (2015) and consider a Gaussian Process classification problem on the UCI Glass dataset. ... we adopt the settings in Shi et al. (2018) and train a deep convolutional VAE with implicit variational posteriors (Implicit VAE for short) on the CelebA dataset. ... Besides the CelebA experiments, we also tested the models on the MNIST dataset and evaluated the test log likelihoods.
Dataset Splits | No | The paper does not state dataset splits (e.g., percentages or sample counts) for training, validation, or test sets.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models or memory amounts) used to run its experiments.
Software Dependencies | No | The paper notes that 'Implementations are based on ZhuSuan (Shi et al., 2017)' but gives no version numbers for languages, libraries, or frameworks.
Experiment Setup | Yes | For the regularization coefficient η in eq. (10), we searched it in {0.001, 0.01, 0.1, 1, 10, 100} and plot the best result at η = 0.1. For SSGE, we set J = 6. ... For SSGE, we set J = 100. ... All other methods are trained with 100 samples for 20k iterations using Adam optimizer (Kingma & Ba, 2014). For SSGE, we set M = 100, and r = 0.99. ... we randomly use between 1 and 10 leapfrog steps of size chosen uniformly in [0.01, 0.1], and a standard Gaussian momentum.
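To connect the hyperparameters above (M samples, J eigenfunctions, the eigenvalue-ratio threshold r) to the method itself, here is a minimal NumPy sketch of the spectral Stein gradient estimator: eigendecompose the RBF Gram matrix of the samples, approximate the top-J eigenfunctions with the Nyström method, and expand ∇ log q in those eigenfunctions. Function names, array shapes, and the fixed bandwidth sigma are our own choices for illustration; the paper selects the bandwidth via the median heuristic and can choose J from the ratio r of retained eigenvalue mass. The authors' actual implementation lives in the repository linked above.

```python
import numpy as np

def rbf_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), evaluated pairwise."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

def grad_rbf_kernel(x, y, sigma):
    """Gradient of k(x, y) with respect to x; shape (n, m, d)."""
    k = rbf_kernel(x, y, sigma)
    diff = x[:, None, :] - y[None, :, :]
    return -diff / sigma ** 2 * k[:, :, None]

def ssge(samples, x, J, sigma):
    """Estimate grad log q at points x from M i.i.d. samples of q.

    samples: (M, d) draws from the implicit distribution q
    x:       (n, d) query points
    J:       number of eigenfunctions kept (J << M assumed)
    sigma:   RBF bandwidth
    """
    M = samples.shape[0]
    K = rbf_kernel(samples, samples, sigma)        # (M, M) Gram matrix
    eigvals, eigvecs = np.linalg.eigh(K)           # ascending eigenvalues
    eigvals = eigvals[::-1][:J]                    # top-J, descending
    eigvecs = eigvecs[:, ::-1][:, :J]

    # Nystrom approximation of the eigenfunctions:
    # psi_j(x) = sqrt(M) / lambda_j * sum_m u_{jm} k(x, x^m)
    Kx = rbf_kernel(x, samples, sigma)             # (n, M)
    psi_x = np.sqrt(M) * (Kx @ eigvecs) / eigvals  # (n, J)

    # beta_j = -(1/M) sum_m grad psi_j(x^m), with
    # grad psi_j(x) = sqrt(M) / lambda_j * sum_m u_{jm} grad_x k(x, x^m)
    gK = grad_rbf_kernel(samples, samples, sigma)  # (M, M, d)
    grad_psi = np.sqrt(M) * np.einsum('nmd,mj->njd', gK, eigvecs)
    grad_psi /= eigvals[None, :, None]
    beta = -grad_psi.mean(axis=0)                  # (J, d)

    # ghat(x) = sum_j beta_j psi_j(x)
    return psi_x @ beta                            # (n, d)
```

A quick sanity check on a target with a known score function (for a standard Gaussian, grad log q(x) = -x), at the scale of the toy settings quoted above:

```python
rng = np.random.default_rng(0)
xs = rng.standard_normal((100, 2))        # M = 100 samples from q = N(0, I)
g_hat = ssge(xs, xs, J=6, sigma=1.0)      # J = 6 as in the toy experiments
print(np.abs(g_hat + xs).mean())          # should be small: true score is -x
```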
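The HMC settings quoted in the last row (1 to 10 leapfrog steps, step size uniform in [0.01, 0.1], standard Gaussian momentum) describe a standard randomized-leapfrog transition. Below is a minimal sketch under those settings, assuming generic log_p and grad_log_p callables on a 1-D parameter vector; in the paper's gradient-free experiments the gradient oracle would be replaced by an SSGE estimate:

```python
import numpy as np

def hmc_step(x, log_p, grad_log_p, rng):
    """One HMC transition with randomized leapfrog settings:
    1-10 steps, step size ~ Uniform[0.01, 0.1], momentum ~ N(0, I)."""
    eps = rng.uniform(0.01, 0.1)
    n_steps = int(rng.integers(1, 11))     # inclusive range 1..10
    p0 = rng.standard_normal(x.shape)      # standard Gaussian momentum

    x_new, p = x.copy(), p0.copy()
    # Leapfrog integration of the Hamiltonian dynamics.
    p += 0.5 * eps * grad_log_p(x_new)
    for _ in range(n_steps - 1):
        x_new += eps * p
        p += eps * grad_log_p(x_new)
    x_new += eps * p
    p += 0.5 * eps * grad_log_p(x_new)

    # Metropolis correction with H(x, p) = -log p(x) + ||p||^2 / 2.
    log_alpha = (log_p(x_new) - 0.5 * p @ p) - (log_p(x) - 0.5 * p0 @ p0)
    if np.log(rng.uniform()) < log_alpha:
        return x_new, True                 # accepted
    return x, False                        # rejected
```

Averaging the returned acceptance flag over many transitions yields acceptance ratios of the kind the paper reports in Figure 2b.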