reproducibilityindex.ai

Measuring Sample Quality with Kernels

Authors: Jackson Gorham, Lester Mackey

ICML 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We next conduct an empirical evaluation of the KSD quality measures recommended by our theory, recording all timings on an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. Throughout, we will refer to the KSD with IMQ base kernel $k(x, y) = (c^2 + \|x - y\|_2^2)^{\beta}$, exponent $\beta = -1/2$, and $c = 1$ as the IMQ KSD. Code reproducing all experiments can be found on the Julia (Bezanson et al., 2014) package site https://jgorham.github.io/SteinDiscrepancy.jl/.
Researcher Affiliation	Collaboration	¹Stanford University, Palo Alto, CA, USA; ²Microsoft Research New England, Cambridge, MA, USA.
Pseudocode	No	No structured pseudocode or algorithm blocks (e.g., a clearly labeled 'Algorithm' or 'Pseudocode' section) were found in the paper.
Open Source Code	Yes	Code reproducing all experiments can be found on the Julia (Bezanson et al., 2014) package site https://jgorham.github.io/SteinDiscrepancy.jl/.
Open Datasets	Yes	Specifically, we evaluate the SGFS-f and SGFS-d samples produced in (Ahn et al., 2012, Sec. 5.1). The target P is a Bayesian logistic regression with a flat prior, conditioned on a dataset of $10^4$ MNIST handwritten digit images.
Dataset Splits	No	The paper describes generating sample sequences (e.g., 'generated 50 independent approximate slice sampling chains') and evaluating their quality, but does not specify traditional train/validation/test dataset splits with percentages or counts as would be found in supervised learning.
Hardware Specification	Yes	recording all timings on an Intel Xeon CPU E5-2650 v2 @ 2.60GHz.
Software Dependencies	No	The paper mentions 'Julia (Bezanson et al., 2014)' as the platform for their code but does not list specific software dependencies or libraries with their version numbers required for reproduction.
Experiment Setup	Yes	For an array of $\epsilon$ values, we generated 50 independent approximate slice sampling chains with batch size 5, each with a budget of 148000 likelihood evaluations, and plotted the median IMQ KSD and effective sample size (ESS, a standard sample quality measure based on asymptotic variance (Brooks et al., 2011)) in Figure 3.
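
For readers unfamiliar with the IMQ KSD named in the Research Type row, the following is a minimal Python sketch of how such a discrepancy can be computed for an equally weighted sample. It is not the authors' Julia implementation. It assumes the Langevin Stein operator formulation, the IMQ base kernel k(x, y) = (c^2 + ||x - y||^2)^beta with beta = -1/2 and c = 1, and a standard Gaussian target whose score is -x. The function names `imq_stein_kernel`, `imq_ksd`, and `std_normal_score` are hypothetical and chosen only for this illustration.

```python
import math
import random


def imq_stein_kernel(x, y, score_x, score_y, c=1.0, beta=-0.5):
    """Stein kernel k0(x, y) induced by the IMQ base kernel
    k(x, y) = (c^2 + ||x - y||^2)^beta under the Langevin Stein operator
    (assumed formulation). score_x and score_y are grad log p at x and y."""
    d = len(x)
    u = [xi - yi for xi, yi in zip(x, y)]
    r2 = sum(ui * ui for ui in u)
    s = c * c + r2
    k = s ** beta
    # grad_x k = g * u and grad_y k = -g * u
    g = 2.0 * beta * s ** (beta - 1.0)
    # sum_i d^2 k / (dx_i dy_i)
    trace = -g * d - 4.0 * beta * (beta - 1.0) * s ** (beta - 2.0) * r2
    u_dot_sx = sum(ui * bi for ui, bi in zip(u, score_x))
    u_dot_sy = sum(ui * bi for ui, bi in zip(u, score_y))
    sx_dot_sy = sum(a * b for a, b in zip(score_x, score_y))
    return trace + g * (u_dot_sy - u_dot_sx) + sx_dot_sy * k


def imq_ksd(points, score, c=1.0, beta=-0.5):
    """KSD of an equally weighted sample: sqrt of the mean of k0 over all pairs.
    Cost is quadratic in the number of points."""
    n = len(points)
    scores = [score(p) for p in points]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += imq_stein_kernel(points[i], points[j],
                                      scores[i], scores[j], c, beta)
    # Clamp tiny negative values caused by floating-point round-off.
    return math.sqrt(max(total, 0.0)) / n


def std_normal_score(x):
    """Score (gradient of the log density) of a standard Gaussian target."""
    return [-xi for xi in x]


# Illustrative comparison: an exact sample versus the same sample shifted off target.
random.seed(0)
exact = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(100)]
shifted = [[a + 2.0, b] for a, b in exact]
ksd_exact = imq_ksd(exact, std_normal_score)
ksd_shifted = imq_ksd(shifted, std_normal_score)
print("KSD (exact sample):  ", ksd_exact)
print("KSD (shifted sample):", ksd_shifted)
```

The shifted sample should receive the larger discrepancy, which is the behaviour a sample quality measure is meant to exhibit. Only the score of the target is required, so the normalizing constant of the density never has to be computed.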