Sliced Kernelized Stein Discrepancy
Authors: Wenbo Gong, Yingzhen Li, José Miguel Hernández-Lobato
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show the proposed discrepancy significantly outperforms KSD and various baselines in high dimensions. For model learning, it shows advantages over existing Stein discrepancy baselines when training independent component analysis (ICA) models with different discrepancies. The paper further proposes a novel particle inference method, sliced Stein variational gradient descent (S-SVGD), which alleviates the mode-collapse issue of SVGD in training variational autoencoders. |
| Researcher Affiliation | Collaboration | Wenbo Gong (University of Cambridge, wg242@cam.ac.uk); Yingzhen Li (Imperial College London, yingzhen.li@imperial.ac.uk); José Miguel Hernández-Lobato (University of Cambridge and The Alan Turing Institute, jmh233@cam.ac.uk). Work done at Microsoft Research Cambridge. |
| Pseudocode | Yes | Algorithm 1: GOF Test with max SKSD U-statistics |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | Gaussian GOF benchmarks (Jitkrittum et al., 2017; Huggins & Mackey, 2018; Chwialkowski et al., 2016). RBM (Liu et al., 2016; Huggins & Mackey, 2018; Jitkrittum et al., 2017). binarized MNIST. UCI datasets (Dua & Graff, 2017). |
| Dataset Splits | No | No explicit percentages, sample counts, or split methodology for train/validation/test partitions are provided for the experiments. For binarized MNIST, the paper mentions the 'first 5,000 test images' but gives no details on training/validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Adam' as an optimizer but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For the testing setup, we set the significance level α = 0.05. For ICA, 'data sampled from a randomly initialized ICA model'. For VAEs, 'The decoder is trained as in vanilla VAEs, but the encoder is trained by amortization'. It also states, 'For fair comparisons, we do not tune the coefficient of the repulsive force.' However, detailed hyperparameters like learning rates, batch sizes, or specific training schedules are not explicitly provided in the main text. |
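
To make the testing setup referenced above more concrete (Algorithm 1's U-statistic estimate plus a significance level of α = 0.05), the following is a minimal illustrative sketch of a kernelized-Stein goodness-of-fit test with a U-statistic estimator and a multinomial bootstrap threshold. It uses a plain RBF-kernel KSD against a standard-normal null for brevity; it is not the paper's maxSKSD with optimized slicing directions, and all function names are our own.

```python
# Illustrative sketch only: a Stein-discrepancy GOF test with a U-statistic
# estimate and a multinomial bootstrap threshold at significance level 0.05.
# This uses a plain RBF-kernel KSD against a standard-normal null, NOT the
# paper's maxSKSD with optimized slicing directions.
import numpy as np


def stein_kernel_gaussian(x, y, bandwidth, score_fn):
    """Stein kernel u_p(x, y) for an RBF kernel k(x, y) = exp(-||x-y||^2 / (2h^2))."""
    d = x.shape[-1]
    diff = x - y
    sq = np.sum(diff ** 2)
    k = np.exp(-sq / (2 * bandwidth ** 2))
    sx, sy = score_fn(x), score_fn(y)
    grad_y_k = diff / bandwidth ** 2 * k           # gradient of k w.r.t. y
    grad_x_k = -diff / bandwidth ** 2 * k          # gradient of k w.r.t. x
    trace_term = (d / bandwidth ** 2 - sq / bandwidth ** 4) * k
    return k * (sx @ sy) + sx @ grad_y_k + grad_x_k @ sy + trace_term


def ksd_gof_test(samples, score_fn, bandwidth=1.0, n_bootstrap=500, alpha=0.05, seed=0):
    """U-statistic KSD GOF test with a multinomial bootstrap null distribution."""
    rng = np.random.default_rng(seed)
    n = samples.shape[0]
    # Pairwise Stein-kernel matrix; the diagonal stays zero (U-statistic excludes i = j).
    U = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                U[i, j] = stein_kernel_gaussian(samples[i], samples[j], bandwidth, score_fn)
    stat = U.sum() / (n * (n - 1))
    # Bootstrap samples: reweight pairs with centered multinomial weights.
    boot = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        w = rng.multinomial(n, np.full(n, 1.0 / n)) / n - 1.0 / n
        boot[b] = w @ U @ w            # zero diagonal, so i = j terms vanish
    threshold = np.quantile(boot, 1 - alpha)
    return stat, threshold, stat > threshold   # reject H0 if the statistic exceeds the threshold


if __name__ == "__main__":
    score_std_normal = lambda x: -x    # score function of N(0, I)
    x = np.random.default_rng(1).normal(size=(200, 5))
    stat, thr, reject = ksd_gof_test(x, score_std_normal)
    print(f"KSD U-stat = {stat:.4f}, threshold = {thr:.4f}, reject = {reject}")
```

The paper's maxSKSD variant additionally projects the score and the samples onto learned slicing directions before applying the kernel, which is what gives it its reported advantage over KSD in high dimensions; the bootstrap-thresholding structure of the test, however, follows the same pattern as this sketch.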