Learning Kernel Tests Without Data Splitting
Authors: Jonas Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "At the same significance level, our approach's test power is empirically larger than that of the data-splitting approach, regardless of its split proportion." "The empirical results suggest that, at the same significance level, the test power of our approach is larger than that of the data-splitting approach, regardless of the split proportion (cf. Section 5)." "We demonstrate the advantages of OST over data-splitting approaches and the Wald test with kernel two-sample testing problems as described in Section 2." |
| Researcher Affiliation | Collaboration | Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet — Max Planck Institute for Intelligent Systems, Tübingen, Germany. {jmkuebler, bs, krikamol}@tue.mpg.de, wittawatj@gmail.com. "Now with Google Research" |
| Pseudocode | Yes | Algorithm 1 One-Sided Test (OST) |
| Open Source Code | Yes | The code for the experiments is available at https://github.com/MPI-IS/tests-wo-splitting. |
| Open Datasets | Yes | 2. MNIST (p = 49): We consider downsampled 7x7 images of the MNIST dataset [40], where P contains all the digits and Q only uneven digits. |
| Dataset Splits | No | The paper discusses 'data splitting' where a portion of data is used for 'learning' and the rest for 'testing' (e.g., 'SPLIT0.1 denotes that 10% of the data are used for learning β and 90% are used for testing'). However, it does not explicitly define distinct training, validation, and test splits or provide specific percentages for a three-way split. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud resources) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions 'SciPy: Open source scientific tools for Python' [26] and 'The cvxopt linear and quadratic cone program solvers' [43], but no specific version numbers for these or other key software components are provided. |
| Experiment Setup | Yes | "For all the setups we estimate the Type-II error for various sample sizes at a level α = 0.05. Error rates are estimated over 5000 independent trials." "For each dataset we consider three different base sets of kernels K and choose σ with the median heuristic: (a) d = 1: K = [k σ], (b) d = 2: K = [k σ, klin], (c) d = 6: K = [k0.25 σ, k0.5 σ, k σ, k2 σ, k4 σ, klin]." |
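The experiment setup above chooses the Gaussian kernel bandwidth σ with the median heuristic, i.e. setting σ to the median pairwise distance of the pooled sample. A minimal sketch of that choice is below; note that conventions differ (some implementations use the median of squared distances), so this is one common variant, not necessarily the exact one used in the paper's code.

```python
import numpy as np

def median_heuristic(X, Y):
    """Bandwidth sigma = median pairwise Euclidean distance
    over the pooled sample (one common convention)."""
    Z = np.vstack([X, Y])
    # all pairwise differences between pooled points
    diffs = Z[:, None, :] - Z[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # median over the strict upper triangle (skip the zero diagonal)
    iu = np.triu_indices_from(dists, k=1)
    return np.median(dists[iu])

def gauss_kernel(x, y, sigma):
    """Gaussian kernel k_sigma(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

# toy two-sample data: P and Q differ in mean
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(50, 2))
Y = rng.normal(0.5, 1.0, size=(50, 2))
sigma = median_heuristic(X, Y)
k_val = gauss_kernel(X[0], Y[0], sigma)
```

The scaled kernels in base set (c), such as k0.25σ, would then simply reuse this σ multiplied by the stated factor.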