Supervised Kernel Thinning
Authors: Albert Gong, Kyuseong Choi, Raaz Dwivedi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our design choices with both simulations and real data experiments. |
| Researcher Affiliation | Academia | Albert Gong, Kyuseong Choi, Raaz Dwivedi, Cornell Tech, Cornell University, {agong,kc728,dwivedi}@cornell.edu |
| Pseudocode | Yes | Algorithm 1: KT-COMPRESS++ Identify coreset of size n... Algorithm 3b: KT-SWAP Identify and refine the best candidate coreset |
| Open Source Code | Yes | Our code can be found at https://github.com/ag2435/npr. |
| Open Datasets | Yes | California Housing regression dataset from Pace and Barry [17] (https://scikit-learn.org/1.5/datasets/real_world.html#california-housing-dataset; BSD-3-Clause license) and the SUSY binary classification dataset from Baldi et al. [2] (https://archive.ics.uci.edu/dataset/279/susy; CC-BY-4.0 license). |
| Dataset Splits | Yes | Specifically, we use a held-out validation set of size 10⁴ and run each parameter configuration 100 times to estimate the validation MSE since KT-KRR and ST-KRR are random. |
| Hardware Specification | Yes | All our experiments were run on a machine with 8 CPU cores and 100 GB RAM. |
| Software Dependencies | No | The paper mentions 'Matlab implementation' and 'Cython implementation' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We select the bandwidth h and regularization parameter λ (for KRR) using grid search. For all methods, we use the Gaussian kernel (23) with bandwidth h = 10. We use λ = 10⁻³ for FULL-KRR, ST-KRR, and KT-KRR and λ = 10⁻⁵ for RPCHOLESKY-KRR. All parameters are chosen with cross-validation. |
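The experiment-setup row describes selecting the Gaussian-kernel bandwidth h and the KRR regularization λ by grid search on a held-out validation set. The sketch below illustrates that tuning loop with a plain NumPy kernel ridge regression on synthetic data; the kernel scaling convention, the grid values, and the synthetic dataset are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def gaussian_kernel(X, Z, h):
    # Gaussian kernel k(x, z) = exp(-||x - z||^2 / (2 h^2));
    # the bandwidth convention here is an assumption for illustration.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * h ** 2))

def krr_fit(X, y, h, lam):
    # Closed-form KRR coefficients: alpha = (K + n*lam*I)^{-1} y
    n = len(X)
    K = gaussian_kernel(X, X, h)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(X_train, alpha, X_test, h):
    return gaussian_kernel(X_test, X_train, h) @ alpha

# Synthetic 1-D regression problem standing in for the real datasets.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)
X_tr, y_tr = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]   # held-out validation split

# Grid search over (h, lam), keeping the pair with lowest validation MSE.
best_h, best_lam, best_mse = None, None, np.inf
for h in [0.1, 0.5, 1.0]:
    for lam in [1e-5, 1e-3, 1e-1]:
        alpha = krr_fit(X_tr, y_tr, h, lam)
        mse = np.mean((krr_predict(X_tr, alpha, X_val, h) - y_val) ** 2)
        if mse < best_mse:
            best_h, best_lam, best_mse = h, lam, mse
```

The paper additionally re-runs each configuration 100 times for the randomized methods (KT-KRR, ST-KRR); that outer averaging loop is omitted here for brevity.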