Measuring Sample Quality with Kernels

Authors: Jackson Gorham, Lester Mackey

ICML 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We next conduct an empirical evaluation of the KSD quality measures recommended by our theory, recording all timings on an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. Throughout, we will refer to the KSD with IMQ base kernel $k(x, y) = (c^2 + \lVert x - y \rVert_2^2)^{\beta}$, exponent $\beta = -\frac{1}{2}$, and $c = 1$ as the IMQ KSD. Code reproducing all experiments can be found on the Julia (Bezanson et al., 2014) package site https://jgorham.github.io/SteinDiscrepancy.jl/. |
| Researcher Affiliation | Collaboration | ¹Stanford University, Palo Alto, CA, USA; ²Microsoft Research New England, Cambridge, MA, USA. |
| Pseudocode | No | No structured pseudocode or algorithm blocks (e.g., a clearly labeled 'Algorithm' or 'Pseudocode' section) were found in the paper. |
| Open Source Code | Yes | Code reproducing all experiments can be found on the Julia (Bezanson et al., 2014) package site https://jgorham.github.io/SteinDiscrepancy.jl/. |
| Open Datasets | Yes | Specifically, we evaluate the SGFS-f and SGFS-d samples produced in (Ahn et al., 2012, Sec. 5.1). The target P is a Bayesian logistic regression with a flat prior, conditioned on a dataset of $10^4$ MNIST handwritten digit images. |
| Dataset Splits | No | The paper describes generating sample sequences (e.g., 'generated 50 independent approximate slice sampling chains') and evaluating their quality, but does not specify traditional train/validation/test dataset splits with percentages or counts as would be found in supervised learning. |
| Hardware Specification | Yes | recording all timings on an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. |
| Software Dependencies | No | The paper mentions 'Julia (Bezanson et al., 2014)' as the platform for their code but does not list specific software dependencies or libraries with their version numbers required for reproduction. |
| Experiment Setup | Yes | For an array of $\epsilon$ values, we generated 50 independent approximate slice sampling chains with batch size 5, each with a budget of 148000 likelihood evaluations, and plotted the median IMQ KSD and effective sample size (ESS, a standard sample quality measure based on asymptotic variance (Brooks et al., 2011)) in Figure 3. |
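
To make the IMQ KSD named in the Research Type entry concrete, here is a minimal NumPy sketch of a kernel Stein discrepancy with the IMQ base kernel (c = 1, β = −1/2) for an equally weighted sample. It is an illustration only and not the authors' Julia implementation: the function name `imq_ksd`, the `score` argument (the gradient of the log target density), and the use of the standard Langevin Stein kernel are assumptions made for this example, and the O(n²) pairwise computation is the simplest possible one.

```python
import numpy as np

def imq_ksd(x, score, c=1.0, beta=-0.5):
    """Sketch of a KSD with IMQ base kernel k(x, y) = (c^2 + ||x - y||^2)^beta.

    x     : (n, d) array of sample points (equal weights assumed).
    score : hypothetical callable mapping (n, d) points to (n, d) values
            of grad log p for the target density p.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    b = score(x)                                  # (n, d) score at each point

    r = x[:, None, :] - x[None, :, :]             # pairwise differences x_i - x_j
    r2 = np.sum(r ** 2, axis=-1)                  # squared distances
    s = c ** 2 + r2
    k = s ** beta                                 # base kernel matrix

    # grad_x k = coef * r and grad_y k = -coef * r
    coef = 2.0 * beta * s ** (beta - 1.0)
    # trace term: sum_i d^2 k / (dx_i dy_i)
    trace = (-2.0 * beta * d * s ** (beta - 1.0)
             - 4.0 * beta * (beta - 1.0) * s ** (beta - 2.0) * r2)
    # <grad_x k, b(x_j)> + <grad_y k, b(x_i)>
    cross = coef * (np.einsum("ijd,jd->ij", r, b)
                    - np.einsum("ijd,id->ij", r, b))
    # Langevin Stein kernel evaluated on all pairs
    k0 = trace + cross + k * (b @ b.T)

    # KSD^2 is the average of k0 over all pairs; clip tiny negative rounding
    return float(np.sqrt(max(k0.mean(), 0.0)))
```

For a standard Gaussian target the score is simply `lambda x: -x`, so a call such as `imq_ksd(sample, lambda x: -x)` should return a smaller value for draws from the target than for draws from a shifted distribution.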