Quasi-Monte Carlo Graph Random Features

Authors: Isaac Reid, Krzysztof M Choromanski, Adrian Weller

NeurIPS 2023

Reproducibility assessment (each item lists the variable, the result, and the LLM's supporting response):
Research Type: Experimental
We demonstrate empirical accuracy improvements on a variety of tasks including a new practical application: time-efficient approximation of the graph diffusion process. To our knowledge, q-GRFs constitute the first rigorously studied quasi-Monte Carlo scheme for kernels defined on combinatorial objects, inviting new research on correlations between graph random walks. We conduct an exhaustive set of experiments in Sec. 4 to compare q-GRFs to GRFs, including: (a) quality of kernel approximation via computation of the relative Frobenius norm; (b) simulation of the graph diffusion process; (c) kernelised k-means node clustering; and (d) kernel regression for node attribute prediction.
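The relative Frobenius norm error used in experiment (a) is straightforward to compute. Below is a minimal NumPy sketch; the Gram matrix and truncated features are toy stand-ins for illustration, not the paper's graph kernels or GRF estimators:

```python
import numpy as np

def relative_frobenius_error(K_exact: np.ndarray, K_approx: np.ndarray) -> float:
    """Relative Frobenius norm error ||K_exact - K_approx||_F / ||K_exact||_F."""
    return np.linalg.norm(K_exact - K_approx, ord="fro") / np.linalg.norm(K_exact, ord="fro")

# Toy example: approximate a small Gram matrix with truncated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
K = X @ X.T            # exact Gram matrix
phi = X[:, :2]         # crude low-dimensional feature approximation
K_hat = phi @ phi.T
err = relative_frobenius_error(K, K_hat)
print(f"relative Frobenius error: {err:.3f}")
```

The same scalar can be used to compare q-GRF and GRF kernel estimates against an exactly computed kernel matrix on small graphs.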
Researcher Affiliation: Collaboration
Isaac Reid (University of Cambridge, ir337@cam.ac.uk); Krzysztof Choromanski (Google DeepMind and Columbia University, kchoro@google.com); Adrian Weller (University of Cambridge and Alan Turing Institute, aw665@cam.ac.uk).
Pseudocode: No
The paper describes algorithms and references Algorithm 1.1 from Choromanski [2023], but it does not present pseudocode or a clearly labeled algorithm block within its own text.
Open Source Code: Yes
Source code is available at https://github.com/isaac-reid/antithetic_termination.
Open Datasets: Yes
The real-world graphs and meshes were accessed from Ivashkin [2023] and Dawson-Haggerty [2023], with further information about the datasets available therein. Where we were able to locate them, the original papers presenting the graphs are: Zachary [1977], Lusseau et al. [2003], Newman [2006], Bollacker et al. [1998], Leskovec et al. [2007].
Dataset Splits: No
The paper describes specific experimental setups for tasks such as kernel approximation, graph diffusion simulation, k-means clustering, and kernel regression (e.g., holding out 5% of node attribute vectors for prediction). However, it does not give standard training/validation/test splits (percentages or sample counts) that would be needed to reproduce the data partitioning.
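The "5% missing vectors" setup for node attribute prediction can be reproduced with a simple random holdout over nodes. The fraction and seed below are illustrative assumptions, since the paper does not specify the partitioning procedure:

```python
import numpy as np

def holdout_node_split(num_nodes: int, frac_missing: float = 0.05, seed: int = 0):
    """Randomly mark a fraction of nodes as having 'missing' attribute vectors."""
    rng = np.random.default_rng(seed)
    num_missing = max(1, round(frac_missing * num_nodes))
    missing = rng.choice(num_nodes, size=num_missing, replace=False)
    observed = np.setdiff1d(np.arange(num_nodes), missing)
    return observed, missing

observed, missing = holdout_node_split(200)
print(len(observed), len(missing))  # prints: 190 10
```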
Hardware Specification: Yes
The experiments in Secs. 4.1, 4.2 and 4.4 were carried out on an Intel Core i5-7640X CPU @ 4.00 GHz × 4; each required 1 CPU hour. The experiments in Sec. 4.3 were carried out on a 2-core Xeon 2.2 GHz with 13 GB RAM and 33 GB HDD; the computations for the largest considered graphs took 1 CPU hour.
Software Dependencies: No
The paper does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup: Yes
We use the regulariser σ = 0.1 and the termination probability p = 0.5.
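The termination probability p governs when each graph random walk halts, which is the mechanism q-GRFs correlate across walks. A minimal sketch of a single p-terminating walk follows; it is illustrative only (the authors' Algorithm 1.1 additionally accumulates feature weights, and the adjacency map here is a toy graph):

```python
import random

def terminating_walk(adj, start, p_term=0.5, rng=None):
    """Simulate one random walk that halts at each step with probability p_term.

    adj maps each node to its list of neighbours. Illustrates the
    termination mechanism underlying graph random features (GRFs).
    """
    rng = rng or random.Random(0)
    walk = [start]
    node = start
    # Continue with probability 1 - p_term at each step.
    while adj[node] and rng.random() >= p_term:
        node = rng.choice(adj[node])
        walk.append(node)
    return walk

# Toy 4-cycle graph.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
walk = terminating_walk(adj, start=0, p_term=0.5)
print(walk)
```

With p = 0.5 the expected walk length is short (geometrically distributed), which is what makes the estimator time-efficient; the quasi-Monte Carlo variant induces antithetic correlations between the termination times of different walks.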