reproducibilityindex.ai

Supervised Kernel Thinning

Authors: Albert Gong, Kyuseong Choi, Raaz Dwivedi

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our design choices with both simulations and real data experiments.
Researcher Affiliation	Academia	Albert Gong Kyuseong Choi Raaz Dwivedi Cornell Tech, Cornell University agong,kc728,dwivedi@cornell.edu
Pseudocode	Yes	Algorithm 1: KT-COMPRESS++ Identify coreset of size n... Algorithm 3b: KT-SWAP Identify and refine the best candidate coreset
Open Source Code	Yes	Our code can be found at https://github.com/ag2435/npr.
Open Datasets	Yes	California Housing regression dataset from Pace and Barry [17] (https://scikit-learn.org/1.5/datasets/real_world.html#california-housing-dataset; BSD-3-Clause license) and the SUSY binary classification dataset from Baldi et al. [2] (https://archive.ics.uci.edu/dataset/279/susy; CC-BY-4.0 license).
Dataset Splits	Yes	Specifically, we use a held-out validation set of size 10^4 and run each parameter configuration 100 times to estimate the validation MSE since KT-KRR and ST-KRR are random.
Hardware Specification	Yes	All our experiments were run on a machine with 8 CPU cores and 100 GB RAM.
Software Dependencies	No	The paper mentions 'Matlab implementation' and 'Cython implementation' but does not provide specific version numbers for these software dependencies.
Experiment Setup	Yes	We select the bandwidth h and regularization parameter λ (for KRR) using grid search. For all methods, we use the Gaussian kernel (23) with bandwidth h = 10. We use λ = λ′ = 10^-3 for FULL-KRR, ST-KRR, and KT-KRR and λ = 10^-5 for RPCHOLESKY-KRR. All parameters are chosen with cross-validation.
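
The grid search over the bandwidth h and regularization parameter λ described in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code (see the repository linked above for that). It fits plain, full-data Gaussian kernel ridge regression on synthetic data and picks the (h, λ) pair with the lowest MSE on a held-out validation set. The function names `gaussian_kernel`, `krr_fit_predict`, and `grid_search` are hypothetical, the kernel scaling exp(-||x - y||^2 / (2 h^2)) and the n·λ ridge convention are assumptions that may differ from the paper's equation (23), and the grid values are chosen only for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, h):
    """Gaussian kernel matrix; the scaling 1/(2 h^2) is an assumption."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * h**2))


def krr_fit_predict(X_tr, y_tr, X_te, h, lam):
    """Fit full kernel ridge regression and predict on X_te.

    Solves (K + n * lam * I) alpha = y; the n * lam convention is an assumption.
    """
    n = len(X_tr)
    K = gaussian_kernel(X_tr, X_tr, h)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_tr)
    return gaussian_kernel(X_te, X_tr, h) @ alpha


def grid_search(X_tr, y_tr, X_val, y_val, bandwidths, lams):
    """Return (validation MSE, h, lam) for the best pair on the grid."""
    best = None
    for h in bandwidths:
        for lam in lams:
            pred = krr_fit_predict(X_tr, y_tr, X_val, h, lam)
            mse = float(np.mean((pred - y_val) ** 2))
            if best is None or mse < best[0]:
                best = (mse, h, lam)
    return best


# Synthetic stand-in for a real dataset: noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)
X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

bandwidths = [0.1, 1.0, 10.0]
lams = [1e-5, 1e-3, 1e-1]
best_mse, best_h, best_lam = grid_search(X_tr, y_tr, X_val, y_val, bandwidths, lams)
print(f"best h={best_h}, lambda={best_lam}, validation MSE={best_mse:.4f}")
```

Full KRR is deterministic, so a single fit per configuration suffices here. For randomized estimators such as KT-KRR and ST-KRR, the validation MSE would instead be averaged over repeated runs per configuration, as the Dataset Splits row describes.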