Measuring Sample Quality with Stein's Method
Authors: Jackson Gorham, Lester Mackey
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now turn to an empirical evaluation of our proposed quality measures. We compute all spanners using the efficient C++ greedy spanner implementation of Bouts et al. [19] and solve all optimization programs using Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21]. All reported timings are obtained using a single core of an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. Section 5.1, A Simple Example: We begin with a simple example to illuminate a few properties of the Stein diagnostic. For the target P = N(0, 1), we generate a sequence of sample points i.i.d. from the target and a second sequence i.i.d. from a scaled Student's t distribution with matching variance and 10 degrees of freedom. The left panel of Figure 1 shows that the complete graph Stein discrepancy applied to the first n Gaussian sample points decays to zero at an n^{-0.52} rate, while the discrepancy applied to the scaled Student's t sample remains bounded away from zero. (An illustrative sketch of this setup appears after the table.) |
| Researcher Affiliation | Academia | Jackson Gorham, Department of Statistics, Stanford University; Lester Mackey, Department of Statistics, Stanford University |
| Pseudocode | Yes | Algorithm 1: Multivariate Spanner Stein Discrepancy; Algorithm 2: Univariate Complete Graph Stein Discrepancy |
| Open Source Code | No | The paper mentions using third-party open-source tools, namely the "C++ greedy spanner implementation of Bouts et al. [19]" and "Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21]", but it does not state that the authors' own implementation of the proposed methodology is open source, nor does it provide a link to it. |
| Open Datasets | No | The paper refers to synthetic target distributions such as N(0,1) or Unif(0,1) and to datasets referenced only by citation (e.g., the "bimodal Gaussian mixture model (GMM) posterior of [3]" and the "dataset of 53 prostate cancer patients... [24]"), but it does not provide concrete access information (a direct URL, DOI, or repository) for these datasets, nor does it explicitly state that they are publicly available with proper attribution. |
| Dataset Splits | No | The paper does not specify exact train/validation/test dataset splits, percentages, or absolute sample counts for reproducibility. It discusses sample sizes (e.g., "sequences of length n = 1000") but not data partitioning for training, validation, and testing. |
| Hardware Specification | Yes | All reported timings are obtained using a single core of an Intel Xeon CPU E5-2650 v2 @ 2.60GHz. |
| Software Dependencies | Yes | We compute all spanners using the efficient C++ greedy spanner implementation of Bouts et al. [19] and solve all optimization programs using Julia for Mathematical Programming [20] with the default Gurobi 6.0.4 solver [21]. |
| Experiment Setup | Yes | For a range of step sizes ε, we use SGLD with minibatch size 5 to draw 50 independent sequences of length n = 1000, and we select the value of ε with the highest median quality, either the maximum effective sample size (ESS, a standard diagnostic based on autocorrelation [1]) or the minimum spanner Stein discrepancy, across these sequences. (A hedged sketch of this selection loop follows the table.) |
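
The Figure 1 experiment quoted in the Research Type row rests on the Stein identity for the target N(0, 1): E_P[g'(X) − X g(X)] = 0 for suitable test functions g. The sketch below is not the authors' LP-based graph Stein discrepancy (that requires the spanner construction and a Gurobi-style solver); it only reproduces the sampling setup, a variance-matched scaled Student's t with 10 degrees of freedom, and tracks the empirical Stein statistic for one hand-picked test function g(x) = x³. The test-function choice, sample-size grid, and function names are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the paper's LP-based discrepancy): empirical Stein
# statistic for g(x) = x^3 under the target N(0,1) versus a variance-matched
# scaled Student's t(10) sample, mirroring the qualitative behavior in Figure 1.
import numpy as np

rng = np.random.default_rng(0)
nu = 10
scale = np.sqrt((nu - 2) / nu)   # variance-matching scale: Var(scale * t_10) = 1

def stein_statistic(x):
    # Empirical mean of (T_P g)(x) = g'(x) - x g(x) with g(x) = x^3, target N(0,1)
    return np.mean(3 * x**2 - x**4)

sizes = [10, 30, 100, 300, 1000, 3000, 10000]
reps = 20  # median over replicates to smooth the decay curve
gauss_med, t_med = [], []
for n in sizes:
    gauss_med.append(np.median([abs(stein_statistic(rng.standard_normal(n)))
                                for _ in range(reps)]))
    t_med.append(np.median([abs(stein_statistic(scale * rng.standard_t(nu, size=n)))
                            for _ in range(reps)]))

# Log-log slope of the Gaussian curve should be close to -1/2 (CLT rate),
# comparable to the n^{-0.52} decay the paper reports for its discrepancy.
slope = np.polyfit(np.log(sizes), np.log(gauss_med), 1)[0]
print(f"Gaussian decay exponent ~ {slope:.2f}")
print(f"Scaled t statistic at n=10000 ~ {t_med[-1]:.2f} (population value 1)")
```

With g(x) = x³, the population Stein statistic is 3·E[X²] − E[X⁴], which equals 0 under N(0, 1) but −1 under the variance-matched t(10) (whose fourth moment is 4), so the two curves separate in the same way the paper's Gaussian and Student's t sequences do.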
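
The Experiment Setup row describes a step-size selection protocol for SGLD. The sketch below mirrors that protocol under stated assumptions: a toy Gaussian-mean posterior stands in for the bimodal GMM posterior of [3], and chain quality is scored only by a simple autocorrelation-based ESS (the paper also selects by the minimum spanner Stein discrepancy, which requires its LP solver). The model, helper names, and step-size grid are illustrative assumptions.

```python
# Sketch of the selection protocol: for each candidate step size eps, run 50
# independent SGLD chains of length n = 1000 with minibatch size 5 and keep the
# eps with the highest median quality score (here: autocorrelation-based ESS).
import numpy as np

rng = np.random.default_rng(1)
N_data = 100
data = rng.normal(1.0, 1.0, size=N_data)   # stand-in model: y_i ~ N(theta, 1)
prior_var = 10.0                            # prior: theta ~ N(0, 10)

def sgld_chain(eps, n_iter=1000, batch=5):
    """One SGLD trajectory targeting p(theta | data)."""
    theta = 0.0
    out = np.empty(n_iter)
    for t in range(n_iter):
        idx = rng.integers(0, N_data, size=batch)
        # Stochastic gradient of the log posterior: prior term + rescaled minibatch term
        grad = -theta / prior_var + (N_data / batch) * np.sum(data[idx] - theta)
        theta += 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps))
        out[t] = theta
    return out

def ess(x):
    """Effective sample size from the initial positive run of autocorrelations."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0]
    s, k = 0.0, 1
    while k < len(acf) and acf[k] > 0:
        s += acf[k]
        k += 1
    return len(x) / (1.0 + 2.0 * s)

step_sizes = [1e-5, 1e-4, 1e-3, 1e-2]
median_ess = {eps: np.median([ess(sgld_chain(eps)) for _ in range(50)])
              for eps in step_sizes}
best = max(median_ess, key=median_ess.get)
print("median ESS by step size:", {e: round(v, 1) for e, v in median_ess.items()})
print("selected step size:", best)
```

Swapping the quality score from ESS to a Stein discrepancy only changes the scoring function and turns the argmax into an argmin; the surrounding protocol (50 chains per ε, median over chains) stays the same.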