KSD Aggregated Goodness-of-fit Test
Authors: Antonin Schrab, Benjamin Guedj, Arthur Gretton
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find on both synthetic and real-world data that KSDAGG outperforms other state-of-the-art quadratic-time adaptive KSD-based goodness-of-fit testing procedures. We discuss the implementation of KSDAGG and experimentally validate our proposed approach on benchmark problems, not only on datasets classically used in the literature but also on original data obtained using state-of-the-art generative models (i.e. Normalizing Flows). |
| Researcher Affiliation | Academia | Antonin Schrab Centre for Artificial Intelligence Gatsby Computational Neuroscience Unit University College London & Inria London a.schrab@ucl.ac.uk Benjamin Guedj Centre for Artificial Intelligence University College London & Inria London b.guedj@ucl.ac.uk Arthur Gretton Gatsby Computational Neuroscience Unit University College London arthur.gretton@gmail.com |
| Pseudocode | Yes | Algorithm 1 KSDAGG |
| Open Source Code | Yes | Contributing to the real-world applications of these goodness-of-fit tests, we provide publicly available code to allow practitioners to employ our method: https://github.com/antoninschrab/ksdagg-paper. |
| Open Datasets | Yes | MNIST dataset (Le Cun et al., 1998, 2010) |
| Dataset Splits | No | The paper uses various datasets (Gamma, GBRBM, MNIST Normalizing Flow) but does not explicitly provide details about train/validation/test splits for its experiments. For MNIST, it mentions a pre-trained model but not the experimental splits for the KSDAGG tests. |
| Hardware Specification | Yes | All experiments have been run on an AMD Ryzen Threadripper 3960X 24 Cores 128Gb RAM CPU at 3.8GHz |
| Software Dependencies | No | The paper mentions using third-party implementations ('Jitkrittum et al. (2017)' and 'Phillip Lippe’s implementation') but does not specify any software dependencies with version numbers for its own code or key libraries. |
| Experiment Setup | Yes | All our experiments are run with level = 0.05 using the IMQ kernel defined in Equation (7) with parameter β_k = 0.5. We use a parametric bootstrap with B1 = B2 = 500 bootstrapped KSD values to compute the adjusted test thresholds, and B3 = 50 steps of bisection method to estimate the correction u in Equation (6). |
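The IMQ (inverse multiquadric) kernel referenced in the Experiment Setup row can be sketched as follows. This is a minimal illustration assuming a common parameterisation, k(x, y) = (1 + ||x − y||² / λ²)^(−β); the exact form of the paper's Equation (7), and the bandwidth collection KSDAGG aggregates over, should be checked against the paper and the authors' repository. The function name `imq_kernel` and the `bandwidth` parameter are illustrative, not from the paper's code.

```python
import numpy as np

def imq_kernel(x, y, bandwidth=1.0, beta=0.5):
    """Sketch of an IMQ kernel under an assumed common parameterisation:
    k(x, y) = (1 + ||x - y||^2 / bandwidth^2) ** (-beta).
    The paper sets beta_k = 0.5; the bandwidth is the parameter that
    KSDAGG aggregates over rather than selecting a single value."""
    sq_dist = np.sum((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2)
    return (1.0 + sq_dist / bandwidth**2) ** (-beta)

# The kernel equals 1 at x = y and decays toward 0 as points move apart.
x = np.array([0.0, 0.0])
print(imq_kernel(x, x))                    # identical inputs
print(imq_kernel(x, np.array([3.0, 4.0]))) # distant inputs give a smaller value
```

With β = 0.5 this kernel is bounded and decays slowly, which is part of why IMQ kernels are a standard choice for KSD goodness-of-fit tests.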