Giga-scale Kernel Matrix-Vector Multiplication on GPU

Authors: Robert Hu, Siu Lun Chau, Dino Sejdinovic, Joan Glaunès

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that F3M has empirical linear time and memory complexity with a relative error of order 10^-3 and can compute a full KMVM for a billion points in under a minute on a high-end GPU, leading to a significant speed-up in comparison to existing CPU methods. (A sketch of the KMVM operation appears after this table.)
Researcher Affiliation | Collaboration | Robert Hu (Amazon, robyhu@amazon.co.uk); Siu Lun Chau (Department of Statistics, University of Oxford, siu.chau@stats.ox.ac.uk); Dino Sejdinovic (School of Computer and Mathematical Sciences, University of Adelaide, dino.sejdinovic@adelaide.edu.au); Joan Alexis Glaunès (MAP5, Université Paris Descartes, alexis.glaunes@mi.parisdescartes.fr)
Pseudocode | No | The paper does not contain a dedicated pseudocode block or algorithm listing.
Open Source Code | Yes | The codebase is released; see [14].
Open Datasets | Yes | We consider uniformly and normally sampled data, the Open Street Map (OSM) dataset [1] and a classification task on the NYC Taxi dataset [2]... We mimic the setup in [32] and consider the datasets 3DRoad, Song, Buzz and House Electric...
Dataset Splits | No | The paper describes how approximation error is calculated but gives no training/validation split, and no explicit validation set is mentioned for hyperparameter tuning or early stopping.
Hardware Specification | Yes | All experiments were run on NVIDIA V100-32GB cards, where the data is fitted entirely on the GPU.
Software Dependencies | No | The paper mentions LibTorch (PyTorch) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The parameters used for F3M are η = 0.1, 0.2, 0.3, 0.5 and r = 2^D, 3^D, 4^D with a cap at r = 2048. (A sketch enumerating this grid follows below.)
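
The claim in the Research Type row concerns kernel matrix-vector multiplication (KMVM), i.e. computing v = Kb with K_ij = k(x_i, y_j). For context, here is a minimal brute-force sketch in PyTorch; the Gaussian kernel, lengthscale, and helper names are illustrative assumptions, not the paper's implementation. F3M approximates this exact O(NM) product in empirically linear time, and a relative error of order 10^-3 would naturally be measured against an exact reference of this kind.

```python
import torch

def exact_kmvm(x, y, b, lengthscale=1.0):
    """Exact Gaussian-kernel matrix-vector product v_i = sum_j k(x_i, y_j) b_j.

    O(N*M) time and memory -- the brute-force baseline that fast methods
    such as F3M approximate. Infeasible at the billion-point scale the
    paper targets; usable here only as a small-scale reference.
    """
    # Pairwise squared distances ||x_i - y_j||^2, shape (N, M).
    d2 = torch.cdist(x, y).pow(2)
    K = torch.exp(-d2 / (2.0 * lengthscale**2))
    return K @ b

def relative_error(v_approx, v_exact):
    # One plausible metric behind the quoted "relative error of order 10^-3".
    return (v_approx - v_exact).norm() / v_exact.norm()

if __name__ == "__main__":
    torch.manual_seed(0)
    n, d = 5_000, 3
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(n, d, device=device)
    b = torch.randn(n, 1, device=device)
    v = exact_kmvm(x, x, b)  # exact reference product
    print(v.shape)
```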
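To make the quoted Experiment Setup grid concrete, the sketch below enumerates the parameter combinations, under the assumptions that η is F3M's accuracy/admissibility parameter, r is the number of interpolation nodes, and "r = 2^D, 3^D, 4^D" means a per-dimension node count raised to the data dimension D, capped at 2048. The helper name f3m_param_grid is hypothetical.

```python
from itertools import product

def f3m_param_grid(D):
    """Enumerate the (eta, r) combinations quoted in the table above.

    Assumed reading: eta in {0.1, 0.2, 0.3, 0.5} and r = min(base**D, 2048)
    for base in {2, 3, 4}, matching the stated cap at r = 2048.
    """
    etas = [0.1, 0.2, 0.3, 0.5]
    rs = sorted({min(base**D, 2048) for base in (2, 3, 4)})
    return list(product(etas, rs))

print(f3m_param_grid(D=3))  # e.g. [(0.1, 8), (0.1, 27), (0.1, 64), ...]
```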