Goodness-of-Fit Testing for Discrete Distributions via Stein Discrepancy
Authors: Jiasen Yang, Qiang Liu, Vinayak Rao, Jennifer Neville
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the proposed goodness-of-fit test to three statistical models involving discrete distributions, and our experiments show that the proposed test typically outperforms a two-sample test based on the maximum mean discrepancy. |
| Researcher Affiliation | Academia | ¹Department of Statistics, Purdue University, West Lafayette, IN; ²Department of Computer Science, The University of Texas at Austin, Austin, TX; ³Department of Computer Science, Purdue University, West Lafayette, IN. |
| Pseudocode | Yes | Algorithm 1 Goodness-of-fit testing via KDSD |
| Open Source Code | No | The paper does not explicitly state that its source code for the methodology is released or provide a link to it. |
| Open Datasets | No | The paper describes generating samples from models (Ising, Bernoulli RBM, ERGM) and mentions using 'ergm R package (Handcock et al., 2017)' for ERGM, but does not specify a publicly available or open dataset that is used for training. |
| Dataset Splits | No | The paper describes drawing samples for hypothesis testing (n samples from q for KDSD, and n from q and n from p for MMD) but does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running its experiments. |
| Software Dependencies | Yes | We utilize the ergm R package (Handcock et al., 2017). R package version 3.8.0. |
| Experiment Setup | Yes | We set m = 5000 for both methods throughout. ... significance level α = 0.05. ... We consider a periodic 10-by-10 lattice, with d = 100 random variables. We focus on the ferromagnetic setting and set θ_ij = 1/T, where T is the temperature of the system. ... We use M = 50 visible units and K = 25 hidden units. We draw the entries of the weight matrix W i.i.d. from a Normal distribution with mean zero and standard deviation 1/M, and the entries of the bias terms b and c i.i.d. from the standard Normal distribution. ... We consider an ERGM distribution for undirected graphs on 20 nodes, with the dimension of each sample d = (20 choose 2) = 190. We fix θ1 = 2 and τ = 0.01. |
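
The experiment setup quoted in the table can be sketched in code to make the stated dimensions concrete. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random seed, and the example temperature `T = 4.0` are hypothetical, and only the numeric settings (m = 5000, α = 0.05, a periodic 10-by-10 lattice, M = 50, K = 25, 20 nodes) come from the quoted text.

```python
import math
import random

# Settings quoted in the "Experiment Setup" row.
SETUP = {"m": 5000, "alpha": 0.05}


def ising_couplings(side, temperature):
    """Coupling matrix for a periodic side-by-side lattice with theta_ij = 1/T.

    Assumption: only nearest neighbours on the lattice are coupled.
    """
    d = side * side
    theta = [[0.0] * d for _ in range(d)]
    for r in range(side):
        for c in range(side):
            i = r * side + c
            # Right and down neighbours, with wrap-around (periodic boundary).
            for rr, cc in ((r, (c + 1) % side), ((r + 1) % side, c)):
                j = rr * side + cc
                theta[i][j] = theta[j][i] = 1.0 / temperature
    return theta


def rbm_parameters(n_visible=50, n_hidden=25, seed=0):
    """Draw W ~ Normal(0, sd = 1/M) and biases b, c ~ Normal(0, 1).

    The seed is a hypothetical choice made for reproducibility of this sketch.
    """
    rng = random.Random(seed)
    sd = 1.0 / n_visible
    W = [[rng.gauss(0.0, sd) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n_visible)]
    c = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    return W, b, c


def ergm_dimension(n_nodes=20):
    """Number of possible edges in an undirected graph on n_nodes nodes."""
    return math.comb(n_nodes, 2)


if __name__ == "__main__":
    theta = ising_couplings(side=10, temperature=4.0)  # T = 4.0 is illustrative only
    W, b, c = rbm_parameters()
    print(len(theta), len(W), len(W[0]), ergm_dimension())
```

The sketch uses only the Python standard library so the dimensions can be checked without extra dependencies; the ERGM parameters θ1 = 2 and τ = 0.01 are not modeled here because the quoted text does not define the sufficient statistics they multiply.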