Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
On the Robustness of Kernel Goodness-of-Fit Tests
Authors: Xing Liu, François-Xavier Briol
JMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We will now evaluate the proposed GOF tests using both synthetic and real data. |
| Researcher Affiliation | Collaboration | Xing Liu EMAIL Quant Co François-Xavier Briol EMAIL Department of Statistical Science University College London |
| Pseudocode | Yes | Algorithm 1 Robust-KSD (R-KSD) test for goodness-of-fit evaluation. |
| Open Source Code | Yes | Code for reproducing all experiments can be found at github.com/XingLLiu/robust-kernel-test. |
| Open Datasets | Yes | We use the data set as Matsubara et al. (2022); Key et al. (2025), which is a 1-dimensional data set of 82 galaxy velocities (Postman et al., 1986; Roeder, 1990). |
| Dataset Splits | Yes | To avoid using the same data for model training and testing, we randomly split the data into equal halves, each containing ndata = 41 data points. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, only mentioning execution times without specifying the processor, GPU, or memory. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | Unless otherwise mentioned, all standard KSD tests are based on an IMQ kernel $k(x, x') = h_{\mathrm{IMQ}}(x - x')$, where $h_{\mathrm{IMQ}}(u) = (1 + \lVert u \rVert_2^2 / \lambda^2)^{-1/2}$ with a bandwidth $\lambda^2 > 0$ selected via the median heuristic, i.e., $\lambda_{\mathrm{med}} = \mathrm{Median}\{\lVert X_i - X_j \rVert_2 : 1 \le i < j \le n\}$. All tilted-KSD and robust-KSD tests are based on a tilted IMQ kernel with weight $w(x) = (1 + \lVert x - a \rVert_2^2 / c)^{-b}$, where $a \in \mathbb{R}^d$ and $c > 0$. We fix $a = 0$ and $c = 1$ in all experiments, as all data will always be centered and on a suitable scale. More generally, we could replace $\lVert x - a \rVert_2^2 / c$ by a weighted norm of the form $(x - a)^\top C (x - a)$, where $C \in \mathbb{R}^{d \times d}$ is a pre-conditioning matrix, chosen possibly as the empirical covariance matrix or robust estimates of it. Since our experiments will focus on sub-Gaussian models, we choose $b = 1/2$. This ensures the Stein kernel is bounded. All tests have nominal level $\alpha = 0.05$. The probability of rejection is computed by averaging over 100 repetitions, and the 95% confidence intervals are reported. |
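
The kernel setup quoted in the Experiment Setup row can be illustrated with a minimal, standard-library-only sketch. It is not taken from the authors' repository: all function names are hypothetical, and the tilted kernel is assumed to take the product form w(x) k(x, x') w(x'), which the quoted text does not spell out. The defaults a = 0, c = 1, and b = 1/2 follow the quoted setup.

```python
import math
from itertools import combinations
from statistics import median


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def median_heuristic(points):
    """Median of the pairwise distances ||X_i - X_j||_2 over all i < j."""
    return median(math.sqrt(sq_dist(x, y)) for x, y in combinations(points, 2))


def imq_kernel(x, y, bandwidth):
    """IMQ kernel (1 + ||x - y||_2^2 / bandwidth^2)^(-1/2)."""
    return (1.0 + sq_dist(x, y) / bandwidth ** 2) ** -0.5


def tilt_weight(x, a=None, c=1.0, b=0.5):
    """Weight w(x) = (1 + ||x - a||_2^2 / c)^(-b); a defaults to the origin."""
    a = [0.0] * len(x) if a is None else a
    return (1.0 + sq_dist(x, a) / c) ** -b


def tilted_imq_kernel(x, y, bandwidth, a=None, c=1.0, b=0.5):
    """Assumed product form of the tilted kernel: w(x) * k(x, y) * w(y)."""
    k = imq_kernel(x, y, bandwidth)
    return tilt_weight(x, a, c, b) * k * tilt_weight(y, a, c, b)


# Hypothetical, already-centered 1-dimensional sample (not the galaxy data).
sample = [[-1.5], [-0.5], [0.0], [0.5], [1.5]]
lam = median_heuristic(sample)
gram = [[tilted_imq_kernel(x, y, lam) for y in sample] for x in sample]
```

Because both the IMQ kernel and the weight lie in (0, 1], every entry of `gram` is also in (0, 1], and points far from the origin are down-weighted relative to the untilted kernel.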