Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions

Authors: Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we study the performance of ParK on some large-scale datasets (n ∼ 10^6, 10^7, 10^9).
Researcher Affiliation | Collaboration | Luigi Carratino (MaLGa, DIBRIS, University of Genova); Stefano Vigogna (MaLGa, DIBRIS, University of Genova); Daniele Calandriello (DeepMind Paris); Lorenzo Rosasco (MaLGa, DIBRIS, University of Genova; IIT; CBMM, MIT)
Pseudocode | Yes | Algorithm 1 ParK: Train [...] Algorithm 2 ParK: Predict
Open Source Code | No | The paper states that the experiments are 'implemented in python using pytorch and the FALKON library [22]' but does not provide an explicit statement or link for the open-source release of the ParK algorithm's code.
Open Datasets | Yes | We perform experiments on the four large-scale datasets TAXI (n ∼ 10^9, d = 9, regression), HIGGS (n ∼ 10^7, d = 28, classification), AIRLINE (n ∼ 10^6, d = 8, regression), AIRLINE-CLS (n ∼ 10^6, d = 8, classification) with the same pre-processing and same random train/test split used in [22].
Dataset Splits | No | The paper mentions using a 'random train/test split' and states 'We do not cross validate hyper-parameters of the local estimators of ParK', but does not specify details about validation dataset splits or percentages.
Hardware Specification | Yes | The experiments run on a machine with 2 Intel Xeon Silver 4116 CPUs and one NVIDIA Titan Xp GPU. The RAM of the machine is 256 GB.
Software Dependencies | No | The paper mentions 'implemented in python using pytorch and the FALKON library [22]', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | We do not cross-validate hyper-parameters of the local estimators of ParK. Instead we use the same ones used by FALKON in the paper [22], with the following exceptions: let λ be the global regularization parameter of FALKON and m the number of Nyström points; the local estimators of ParK use regularization λ_q = λρ_q^-1 and m_q = mρ_q, as suggested by the theory. [...] The number of centroids used by ParK and D&C-FALK is Q = 32 for all experiments.
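
The per-partition scaling rule quoted in the Experiment Setup row can be made concrete with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes that ρ_q denotes the fraction of training points falling in partition q (consistent with the quoted scaling, but not defined in the quote) and that partitions are nearest-centroid (Voronoi) cells in feature space; the function names are hypothetical.

```python
# Minimal sketch of per-partition FALKON hyperparameters under the
# assumptions stated above. Not the authors' code.
import numpy as np

def assign_partitions(X, centroids):
    """Assign each row of X to its nearest centroid (Voronoi cell)."""
    # Squared Euclidean distances between all points and all centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def local_hyperparams(assignments, Q, lam, m):
    """Scale the global parameters (lam, m) per partition:
    lambda_q = lam * rho_q^-1 and m_q = m * rho_q,
    with rho_q = n_q / n (assumed definition of rho_q).
    Assumes every partition is nonempty."""
    n = assignments.shape[0]
    params = []
    for q in range(Q):
        rho_q = (assignments == q).sum() / n
        params.append({
            "lambda_q": lam / rho_q,               # stronger regularization on smaller cells
            "m_q": max(1, int(round(m * rho_q))),  # fewer Nystrom centers on smaller cells
        })
    return params

# Toy usage with Q = 32 centroids, as in the quoted setup.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
centroids = X[rng.choice(len(X), size=32, replace=False)]
parts = assign_partitions(X, centroids)
params = local_hyperparams(parts, Q=32, lam=1e-6, m=1000)
```

The direction of the scaling matches the quote: a partition holding a fraction ρ_q of the data gets a larger local regularization λ/ρ_q and proportionally fewer Nyström points mρ_q, so the per-partition problems stay balanced relative to their sample sizes.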