Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy

Authors: Xing Liu, Andrew B. Duncan, Axel Gandy

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show theoretically and empirically that the KSD test can suffer from low power when the target and the alternative distributions have the same well-separated modes but differ in mixing proportions. We provide numerical evidence that with suitably chosen transition kernels the proposed approach can lead to substantially higher power than the KSD test." See also Section 7 (Experiments). A toy construction of such a pair of mixtures is sketched below the table.
Researcher Affiliation | Academia | "Department of Mathematics, Imperial College London, London, UK; Alan Turing Institute, London, UK."
Pseudocode | Yes | "Algorithm 1: Goodness-of-Fit Test with spKSD."
Open Source Code | Yes | "Code for reproducing all experiments can be found at github.com/XingLLiu/pksd."
Open Datasets | No | The paper describes generating samples for its experiments (e.g., Gaussian mixture, mixture of t and banana distributions, sensor network localisation, Gaussian-Bernoulli restricted Boltzmann machine (GB-RBM)) and cites related work (Pompe et al., 2020; Tak et al., 2018; Cho et al., 2013), but does not provide explicit access links, DOIs, repositories, or formal dataset citations for the specific datasets used to run the experiments.
Dataset Splits | No | The paper reports sample sizes (e.g., 'All samples have size n = 1000') but does not specify training, validation, or test splits (percentages, counts, or references to predefined splits).
Hardware Specification | No | The paper does not report hardware details such as GPU models, CPU types, or memory used to run the experiments.
Software Dependencies | No | The paper mentions the BFGS algorithm and the IMQ kernel but does not name any software libraries or tools with version numbers.
Experiment Setup | Yes | "All samples have size n = 1000. All experiments are run with level $\alpha = 0.05$ using the IMQ kernel $k(x, y) = (1 + \|x - y\|_2^2/\lambda)^{-1/2}$, where $\lambda$ is chosen to be $\mathrm{median}_{i<j}\{\|x_i - x_j\|_2^2\}$. The probability of rejecting the null hypothesis is estimated by averaging the test output over 100 repetitions, except in the sensor network localisation example, which is repeated 10 times. The number of transition steps $T$ is selected to be 10 for the Gaussian mixture example, 100 for the mixture of t and banana distributions example, and 1000 for the sensor network localisation example." A code sketch of this kernel and evaluation protocol follows below.
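For context on the "Research Type" row above: the low-power regime the paper targets is a pair of distributions with identical, well-separated modes but different mixing proportions. Below is a minimal NumPy sketch of such a pair; the 1-D setting, mode locations, and weights are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(rng, n, weights, means, std=1.0):
    """Draw n points from a 1-D Gaussian mixture with the given weights and means."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(loc=np.asarray(means)[comps], scale=std)

# Target p and alternative q share the same well-separated modes (at 0 and 10)
# but differ in mixing proportions; the weights below are illustrative.
x_p = sample_mixture(rng, 1000, weights=[0.5, 0.5], means=[0.0, 10.0])
x_q = sample_mixture(rng, 1000, weights=[0.9, 0.1], means=[0.0, 10.0])
```

Locally, each sample looks similar around every mode; per the quoted abstract, this is the regime in which the plain KSD test has low power.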
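The "Experiment Setup" row fully specifies the kernel and the evaluation protocol. Below is a minimal NumPy sketch of those two pieces. It is not the authors' implementation (see github.com/XingLLiu/pksd for that); `sample_fn`, `test_fn`, and the toy rejection threshold are hypothetical placeholders standing in for the paper's spKSD test.

```python
import numpy as np

def median_heuristic(x):
    """Bandwidth lambda = median over pairs i < j of ||x_i - x_j||_2^2."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)  # (n, n)
    iu = np.triu_indices(len(x), k=1)                                 # pairs with i < j
    return np.median(sq_dists[iu])

def imq_gram(x, lam):
    """Gram matrix of the IMQ kernel k(x, y) = (1 + ||x - y||_2^2 / lambda)^(-1/2)."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return (1.0 + sq_dists / lam) ** (-0.5)

def rejection_rate(sample_fn, test_fn, n_reps=100, alpha=0.05, seed=0):
    """Estimate the rejection probability by averaging the binary test
    output over n_reps repetitions, as in the quoted protocol."""
    rng = np.random.default_rng(seed)
    hits = sum(int(test_fn(sample_fn(rng), alpha)) for _ in range(n_reps))
    return hits / n_reps

# Toy usage with a placeholder "test" (NOT the paper's spKSD test): draw
# n = 1000 points and reject whenever the median-heuristic bandwidth
# exceeds an arbitrary threshold, purely to exercise the harness.
sample_fn = lambda rng: rng.normal(size=(1000, 2))
test_fn = lambda x, alpha: median_heuristic(x) > 4.0
print(rejection_rate(sample_fn, test_fn))
```

In an actual run, `test_fn` would compute the spKSD statistic with `imq_gram` at the median-heuristic bandwidth and compare it against a bootstrap threshold at level alpha = 0.05, per the quoted setup.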