reproducibilityindex.ai

Measuring Sample Quality with Stein's Method

Authors: Jackson Gorham, Lester Mackey

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We now turn to an empirical evaluation of our proposed quality measures. We compute all spanners using the efficient C++ greedy spanner implementation of Bouts et al. [19] and solve all optimization programs using Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21]. All reported timings are obtained using a single core of an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. 5.1 A Simple Example. We begin with a simple example to illuminate a few properties of the Stein diagnostic. For the target P = N(0, 1), we generate a sequence of sample points i.i.d. from the target and a second sequence i.i.d. from a scaled Student's t distribution with matching variance and 10 degrees of freedom. The left panel of Figure 1 shows that the complete graph Stein discrepancy applied to the first n Gaussian sample points decays to zero at an n^{-0.52} rate, while the discrepancy applied to the scaled Student's t sample remains bounded away from zero.
Researcher Affiliation	Academia	Jackson Gorham, Department of Statistics, Stanford University; Lester Mackey, Department of Statistics, Stanford University
Pseudocode	Yes	Algorithm 1: Multivariate Spanner Stein Discrepancy; Algorithm 2: Univariate Complete Graph Stein Discrepancy
Open Source Code	No	The paper mentions using third-party open-source tools like "C++ greedy spanner implementation of Bouts et al. [19]" and "Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21]", but does not state that the authors' own developed methodology code is open-source or provide a link to it.
Open Datasets	No	The paper refers to target distributions like N(0,1) or Unif(0,1), or uses datasets referenced by a paper citation for context (e.g., "bimodal Gaussian mixture model (GMM) posterior of [3]", "dataset of 53 prostate cancer patients... [24]"), but it does not provide concrete access information (like a direct URL, DOI, or repository) for these datasets, nor does it explicitly state they are publicly available with proper attribution.
Dataset Splits	No	The paper does not specify exact train/validation/test dataset splits, percentages, or absolute sample counts for reproducibility. It discusses sample sizes (e.g., "sequences of length n = 1000") but not data partitioning for training, validation, and testing.
Hardware Specification	Yes	All reported timings are obtained using a single core of an Intel Xeon CPU E5-2650 v2 @ 2.60GHz.
Software Dependencies	Yes	All reported timings are obtained using a single core of an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. We compute all spanners using the efficient C++ greedy spanner implementation of Bouts et al. [19] and solve all optimization programs using Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21].
Experiment Setup	Yes	For a range of step sizes ε, we use SGLD with minibatch size 5 to draw 50 independent sequences of length n = 1000, and we select the value of ε with the highest median quality, either the maximum effective sample size (ESS, a standard diagnostic based on autocorrelation [1]) or the minimum spanner Stein discrepancy, across these sequences.
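
The simple example quoted in the Research Type row compares i.i.d. draws from N(0, 1) with draws from a Student's t distribution with 10 degrees of freedom, rescaled to have matching variance. The sketch below shows one way such samples could be generated. It is not the authors' code (the paper reports using Julia and C++), and the function names `scaled_student_t` and `draw_samples`, the seed, and the sample size are assumptions made for illustration. It relies on the fact that a t variate with ν > 2 degrees of freedom has variance ν / (ν - 2), so multiplying by sqrt((ν - 2) / ν) gives unit variance.

```python
import math
import random
import statistics


def scaled_student_t(df, rng):
    """Draw one Student's t variate with `df` degrees of freedom, rescaled to unit variance.

    Hypothetical helper for illustration; requires df > 2.
    """
    z = rng.gauss(0.0, 1.0)
    # Chi-square(df) is Gamma(shape=df/2, scale=2).
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    t = z / math.sqrt(chi2 / df)
    # Var(t_df) = df / (df - 2), so this factor brings the variance to 1.
    return t * math.sqrt((df - 2.0) / df)


def draw_samples(n, df=10, seed=0):
    """Return (gaussian, student): two lists of n i.i.d. draws each.

    `seed` is an arbitrary choice so that repeated runs match.
    """
    rng = random.Random(seed)
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
    student = [scaled_student_t(df, rng) for _ in range(n)]
    return gaussian, student


if __name__ == "__main__":
    gaussian, student = draw_samples(n=20000)
    # Both empirical variances should be close to 1.
    print(statistics.pvariance(gaussian), statistics.pvariance(student))
```

Because the two samples agree in mean and variance, diagnostics based only on the first two moments would not separate them, which is why the quoted passage highlights that the Stein discrepancy of the Student's t sample stays bounded away from zero.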