Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions
Authors: Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we study the performance of ParK on some large-scale datasets (n ≈ 10^6, 10^7, 10^9). |
| Researcher Affiliation | Collaboration | Luigi Carratino (MaLGa, DIBRIS, University of Genova); Stefano Vigogna (MaLGa, DIBRIS, University of Genova); Daniele Calandriello (DeepMind Paris); Lorenzo Rosasco (MaLGa, DIBRIS, University of Genova; IIT; CBMM, MIT) |
| Pseudocode | Yes | Algorithm 1 ParK: Train [...] Algorithm 2 ParK: Predict |
| Open Source Code | No | The paper states that the experiments are 'implemented in python using pytorch and the FALKON library [22]', but it does not provide an explicit statement of, or link to, an open-source release of the ParK algorithm's code. |
| Open Datasets | Yes | We perform experiments on the four large-scale datasets TAXI (n ≈ 10^9, d = 9, regression), HIGGS (n ≈ 10^7, d = 28, classification), AIRLINE (n ≈ 10^6, d = 8, regression), AIRLINE-CLS (n ≈ 10^6, d = 8, classification) with the same pre-processing and same random train/test split used in [22]. |
| Dataset Splits | No | The paper mentions using a 'random train/test split' and states 'We do not cross validate hyper-parameters of the local estimators of ParK', but it does not specify validation-split details or percentages. |
| Hardware Specification | Yes | The experiments run on a machine with 2 Intel Xeon Silver 4116 CPUs and 1 NVIDIA Titan Xp GPU. The RAM of the machine is 256 GB. |
| Software Dependencies | No | The paper states the experiments are 'implemented in python using pytorch and the FALKON library [22]', but it does not provide version numbers for these software components. |
| Experiment Setup | Yes | We do not cross validate hyper-parameters of the local estimators of ParK. Instead we use the same ones used by FALKON in the paper [22], with the following exceptions: let λ be the global regularization parameter of FALKON and m the number of Nyström points; the local estimators of ParK use regularization λ_q = λρ_q^{-1} and m_q = mρ_q, as suggested by the theory. [...] The number of centroids used by ParK and D&C-FALK is Q = 32 for all experiments. |
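The per-partition scaling rule quoted above can be sketched as a few lines of NumPy. This is a hypothetical illustration, not the authors' code: it assumes ρ_q denotes the fraction of training points falling in partition q, and the function and variable names (`local_hyperparameters`, `global_lam`, `global_m`) are invented for the example.

```python
import numpy as np

def local_hyperparameters(global_lam, global_m, partition_sizes):
    """Scale global FALKON hyper-parameters to each of the Q partitions.

    Following the rule quoted from the paper:
        lambda_q = lambda * rho_q^{-1}   (local regularization)
        m_q      = m * rho_q             (local number of Nystrom points)
    where rho_q is the fraction of training points in partition q
    (an assumption of this sketch).
    """
    sizes = np.asarray(partition_sizes, dtype=float)
    rho = sizes / sizes.sum()                  # rho_q, fraction per partition
    lambdas = global_lam / rho                 # lambda_q = lambda * rho_q^{-1}
    ms = np.rint(global_m * rho).astype(int)   # m_q = m * rho_q (rounded)
    return lambdas, ms

# Toy usage: Q = 4 unevenly sized partitions
lams, ms = local_hyperparameters(1e-6, 10_000, [400, 100, 300, 200])
```

Note how smaller partitions receive larger local regularization and fewer Nyström points, so the total inducing-point budget stays roughly equal to the global m.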