Kernel Interpolation with Sparse Grids

Authors: Mohit Yadav, Daniel R. Sheldon, Cameron Musco

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "In this section, we empirically evaluate the time and memory taken by Algorithm 1 for matrix-vector multiplication with the sparse grid kernel matrix, the accuracy of sparse grid interpolation and GP regression as the data dimension d increases, and the accuracy of sparse grid kernel interpolation for GP regression on real higher-dimensional datasets from UCI."
Researcher Affiliation: Academia — "Mohit Yadav, University of Massachusetts Amherst (ymohit@cs.umass.edu); Daniel Sheldon, University of Massachusetts Amherst (sheldon@cs.umass.edu); Cameron Musco, University of Massachusetts Amherst (cmusco@cs.umass.edu)"
Pseudocode: Yes — "Algorithm 1: Sparse Grid Kernel-MVM Algorithm"
Open Source Code: Yes — "We provide an efficient GPU implementation of the proposed algorithm compatible with GPyTorch [9], which is available at https://github.com/ymohit/skisg and licensed under the MIT license."
Open Datasets: Yes — "To evaluate the effectiveness of our proposed methods for scaling GP kernel interpolation to higher dimensions, we consider all UCI [4] data sets with dimension 8 ≤ d ≤ 10." [4] Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci.edu/ml.
Dataset Splits: Yes — "For all datasets, we use a 90/10 train/test split of the data and use the remaining 10% as validation data."
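The quoted 90/10 protocol can be sketched as follows; the dataset size, seed, and helper name are illustrative assumptions, not details taken from the paper.

```python
import random

def train_test_split_indices(n, test_frac=0.10, seed=0):
    """Shuffle indices and hold out a test_frac fraction as the test set.

    Sketch of the quoted 90/10 split; the seed and function name are
    illustrative assumptions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(test_frac * n)
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)

train_idx, test_idx = train_test_split_indices(1000)
```

With n = 1000 this yields 900 training and 100 test indices, disjoint and jointly covering the data.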
Hardware Specification: Yes — "For d = 10, both methods with cubic interpolation run out of GPU memory, which is 48 GB for this experiment." / "All experiments were run on a single NVIDIA GeForce RTX 3090 with 24GB of GPU memory."
Software Dependencies: No — The paper mentions implementing the models in "PyTorch (Paszke et al., 2019) and GPyTorch (Gardner et al., 2018)" but does not give version numbers for these dependencies, which are needed for an exactly reproducible setup.
Experiment Setup: Yes — "We train all our models for 500 iterations using Adam optimizer with an initial learning rate of 0.01 and a learning rate decay of 0.1 every 250 iterations."
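The quoted recipe implies a step-decay learning-rate schedule. A minimal sketch, assuming the decay multiplies the rate at each 250-iteration boundary (as in PyTorch's StepLR); the function name is illustrative:

```python
def lr_at(iteration, base_lr=0.01, decay=0.1, step_every=250):
    """Learning rate under step decay: multiplied by `decay` every `step_every` iterations."""
    return base_lr * decay ** (iteration // step_every)

# Over the quoted 500 iterations: 0.01 for iterations 0-249, then 0.001 for 250-499.
schedule = [lr_at(t) for t in range(500)]
```

In a PyTorch training loop this corresponds to `torch.optim.Adam(params, lr=0.01)` paired with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=250, gamma=0.1)`.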