Entrywise error bounds for low-rank approximations of kernel matrices

Authors: Alexander Modell

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we validate our theory with an empirical study of a collection of synthetic and real-world datasets.
Researcher Affiliation | Academia | Alexander Modell, Department of Mathematics, Imperial College London, U.K. (a.modell@imperial.ac.uk)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it focuses on theoretical derivations and proofs.
Open Source Code | Yes | Code to reproduce the experiments in this section can be found at https://gist.github.com/alexandermodell/b16b0b29b6d0a340a23dab79219133f2.
Open Datasets | Yes | GMM is a synthetic dataset... Abalone [Nash et al., 1995]... Wine Quality [Cortez et al., 2009]... MNIST [Deng, 2012]... 20 Newsgroups [Lang, 1995]... Zebrafish [Wagner et al., 2018].
Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. It mentions that some datasets are…
Hardware Specification | Yes | The experiments were performed on the HPC cluster at Imperial College London with 8 cores and 16GB of RAM.
Software Dependencies | No | The paper mentions using 'the svds function in the SciPy library for Python [Virtanen et al., 2020]' but does not provide specific version numbers for Python or SciPy.
Experiment Setup | Yes | For each dataset, we construct four kernel matrices using Matérn kernels with smoothness parameters ν = 1/2, 3/2, 5/2, ∞, each time selecting the bandwidth using the median heuristic. For each kernel, we compute the best rank-d low-rank approximation of the kernel matrix using the svds function in the SciPy library for Python [Virtanen et al., 2020]. We do this for a range of ranks d from 1 to n, where n is the number of instances in the dataset, and record the entrywise errors.
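
The Experiment Setup row maps directly onto a few lines of NumPy/SciPy. The sketch below is illustrative only and is not the author's code (that is linked in the Open Source Code row): it substitutes a small synthetic Gaussian sample for the paper's datasets, uses the standard closed-form Matérn kernels for ν ∈ {1/2, 3/2, 5/2, ∞} (ν = ∞ being the Gaussian/RBF limit) with the usual median-of-pairwise-distances bandwidth heuristic, and sweeps an illustrative grid of ranks d rather than the full range 1 to n. The matern_kernel helper and the choice of the maximum absolute entry as the error summary are assumptions made for this sketch.

    import numpy as np
    from scipy.sparse.linalg import svds
    from scipy.spatial.distance import pdist, squareform

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))    # synthetic stand-in for one of the paper's datasets

    dists = pdist(X)                 # condensed vector of pairwise Euclidean distances
    bandwidth = np.median(dists)     # median heuristic for the kernel bandwidth
    D = squareform(dists)            # full n x n distance matrix

    def matern_kernel(D, nu, ell):
        """Closed-form Matern kernel for nu in {1/2, 3/2, 5/2, inf} (illustrative helper)."""
        r = D / ell
        if nu == 0.5:
            return np.exp(-r)
        if nu == 1.5:
            return (1 + np.sqrt(3) * r) * np.exp(-np.sqrt(3) * r)
        if nu == 2.5:
            return (1 + np.sqrt(5) * r + 5 * r**2 / 3) * np.exp(-np.sqrt(5) * r)
        if nu == np.inf:             # limiting case: Gaussian (RBF) kernel
            return np.exp(-r**2 / 2)
        raise ValueError("unsupported smoothness parameter")

    for nu in (0.5, 1.5, 2.5, np.inf):
        K = matern_kernel(D, nu, bandwidth)
        for d in (1, 5, 20, 100):    # illustrative ranks; the paper sweeps d from 1 to n
            # Truncated SVD of K: keep the top d singular triplets.
            U, s, Vt = svds(K, k=d)
            K_d = (U * s) @ Vt       # rank-d reconstruction, U diag(s) Vt
            err = np.abs(K - K_d).max()    # maximum absolute entrywise error
            print(f"nu={nu}, d={d:>3}: max entrywise error = {err:.3e}")

By the Eckart-Young theorem, the truncated SVD computed by svds is the best rank-d approximation in spectral and Frobenius norm, which matches the paper's "best rank-d low-rank approximation"; the printed quantity tracks the entrywise error that the paper's bounds concern. Using svds rather than a full SVD is the natural choice here because only the top d singular triplets are needed when d ≪ n.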