Message Passing Stein Variational Gradient Descent

Authors: Jingwei Zhuo, Chang Liu, Jiaxin Shi, Jun Zhu, Ning Chen, Bo Zhang

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical analysis finds a negative correlation between dimensionality and the repulsive force of SVGD, which is to blame for this phenomenon. We propose Message Passing SVGD (MP-SVGD) to solve this problem. By leveraging the conditional independence structure of probabilistic graphical models (PGMs), MP-SVGD converts the original high-dimensional global inference problem into a set of local ones over the Markov blanket with lower dimensions. Experimental results show its advantages in preventing vanishing repulsive force in high-dimensional space over SVGD, and its particle efficiency and approximation flexibility over other inference methods on graphical models.
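The claimed negative correlation between dimensionality and the repulsive force can be illustrated with a small numerical sketch (not from the paper; it assumes the standard RBF kernel with median-heuristic bandwidth and standard-normal particles):

```python
import numpy as np

def mean_repulsion(X):
    """Average norm of the SVGD repulsive term (1/n) * sum_j grad_{x_j} k(x_j, x_i)
    for an RBF kernel with median-heuristic bandwidth (a common SVGD default)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]        # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)             # pairwise squared distances
    h = np.median(sq) / np.log(n + 1)           # median-heuristic bandwidth
    K = np.exp(-sq / h)
    # grad_{x_j} exp(-||x_j - x_i||^2 / h) = (2/h) * (x_i - x_j) * k(x_j, x_i)
    rep = (2.0 / (h * n)) * np.einsum('ij,ijd->id', K, diff)
    return float(np.mean(np.linalg.norm(rep, axis=1)))

rng = np.random.default_rng(0)
for d in (1, 10, 100):
    X = rng.standard_normal((100, d))           # 100 particles in d dimensions
    print(d, mean_repulsion(X))                 # magnitude shrinks as d grows
```

With the median heuristic, the typical kernel value stays near 1/n regardless of dimension while the prefactor 2/h decays with d, so the average repulsion shrinks as dimension grows, consistent with the paper's analysis.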
Researcher Affiliation | Academia | Jingwei Zhuo, Chang Liu, Jiaxin Shi, Jun Zhu, Ning Chen, Bo Zhang (Dept. of Comp. Sci. & Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., THBI Lab, Tsinghua University, Beijing, 100084, China). Correspondence to: Jingwei Zhuo <zjw15@mails.tsinghua.edu.cn>, Jun Zhu <dcszj@tsinghua.edu.cn>.
Pseudocode | Yes | Algorithm 1: Message Passing SVGD
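The paper's Algorithm 1 builds on the vanilla SVGD update, which it applies locally over Markov blankets. A minimal sketch of that underlying update (vanilla SVGD, not the message-passing variant; the RBF kernel and median-heuristic bandwidth follow the paper's experimental setup) looks like:

```python
import numpy as np

def svgd_step(X, grad_logp, stepsize=0.05):
    """One vanilla SVGD update:
    phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    with an RBF kernel and median-heuristic bandwidth."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]        # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    h = np.median(sq) / np.log(n + 1)           # median-heuristic bandwidth
    K = np.exp(-sq / h)
    drive = K @ grad_logp(X)                    # driving (attractive) term
    repulse = (2.0 / h) * np.einsum('ij,ijd->id', K, diff)  # repulsive term
    return X + stepsize * (drive + repulse) / n
```

MP-SVGD differs by running updates of this form over each variable's Markov blanket rather than over the full joint state, which keeps the kernel's effective dimension low.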
Open Source Code | No | The paper does not provide concrete access to source code (no specific repository link, explicit code release statement, or mention of code in supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | We follow the settings of (Lienart et al., 2015) and focus on a pairwise MRF on the 2D grid... We run 100 chains in parallel with 40,000 samples for each chain after 10,000 burn-in samples, i.e. 4 million samples in total. ... We use the Gaussian distribution as the factors, and the moment matching step is done by numerical integration due to the non-Gaussian nature of p(x). EPBP is a variant of BP methods and the original state-of-the-art method on this task. It uses weighted samples to estimate the messages while other methods (except EP) use unweighted samples to approximate p(x) directly. ... We focus on the pairwise MRF where F indexes all the edge factors, Ji = [1, 1], N = 1 and J = 15. All the parameters (i.e., ϵ, Ji, σi and sj) are pre-learned and details can be found in (Schmidt et al., 2010). ... Table 1: Denoising results for 10 test images (Lan et al., 2006) from the BSD dataset (Martin et al., 2001).
Dataset Splits | No | The paper mentions generating ground truth data and using test images from public datasets, but it does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) needed to reproduce the data partitioning for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions using a library for Bayesian deep learning (ZhuSuan: A library for Bayesian deep learning. arXiv preprint arXiv:1709.05870, 2017) in the references, but it does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9) needed to replicate the experiment.
Experiment Setup | Yes | We use the RBF kernel with the bandwidth chosen by the median heuristic for all experiments. ... For EP, we use the Gaussian distribution as the factors, and the moment matching step is done by numerical integration due to the non-Gaussian nature of p(x). ... Parameters α1 and α2 are set to 0.6 and 0.4. ... We consider a 10 x 10 grid except in Fig. 5, whose grid size ranges from 2 x 2 to 10 x 10. All experimental results are averaged over 10 runs with random initializations. ... We run 100 chains in parallel with 40,000 samples for each chain after 10,000 burn-in samples, i.e. 4 million samples in total. ... We compare SVGD and MP-SVGD with Gibbs sampling with auxiliary variables (Aux. Gibbs)...
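The EP baseline's moment-matching step by numerical integration can be sketched for a 1-D marginal as follows (an illustrative sketch only; the paper gives no implementation details, and the grid bounds and resolution here are assumptions):

```python
import numpy as np

def gaussian_moment_match(log_p_unnorm, lo=-10.0, hi=10.0, num=2001):
    """Fit a Gaussian to an unnormalized 1-D density by matching its first two
    moments, computed with a simple Riemann sum on a uniform grid."""
    x = np.linspace(lo, hi, num)
    dx = x[1] - x[0]
    w = np.exp(log_p_unnorm(x))          # unnormalized density values
    Z = w.sum() * dx                      # normalizing constant
    mean = (x * w).sum() * dx / Z         # first moment
    var = ((x - mean) ** 2 * w).sum() * dx / Z  # second central moment
    return mean, var

# sanity check against a known Gaussian: should recover mean 0, variance 1
m, v = gaussian_moment_match(lambda x: -0.5 * x ** 2)
```

The returned mean and variance define the Gaussian factor used in the EP update; for multi-dimensional non-Gaussian factors the same idea applies with a quadrature rule per dimension.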