Log-Linear-Time Gaussian Processes Using Binary Tree Kernels

Authors: Michael K. Cohen, Samuel Daulton, Michael A. Osborne

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a classic suite of regression tasks, we compare our kernel against Matérn, sparse, and sparse variational kernels. The binary tree GP assigns the highest likelihood to the test data on a plurality of datasets, usually achieves lower mean squared error than the sparse methods, and often ties or beats the Matérn GP.
Researcher Affiliation | Collaboration | Michael K. Cohen (University of Oxford) michael.cohen@eng.ox.ac.uk; Samuel Daulton (University of Oxford, Meta) sdaulton@meta.com; Michael A. Osborne (University of Oxford) mosb@robots.ox.ac.uk
Pseudocode | Yes | Algorithm 1: Linear Transformation with SROS Linear Operator. Algorithm 2: Inverse and determinant of I + SROS Linear Operator. (A hedged sketch of such an operator appears after this table.)
Open Source Code | Yes | The code is available at https://github.com/mkc1000/btgp and https://tinyurl.com/btgp-colab.
Open Datasets | Yes | We evaluate our method on the same open-access UCI datasets [4] as Wang et al. [25]... [4] Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2017. URL: http://archive.ics.uci.edu/ml.
Dataset Splits | Yes | We evaluate our method on the same open-access UCI datasets [4] as Wang et al. [25], using their same training, validation, and test partitions, and we compare against the baseline results they report.
Hardware Specification | Yes | using a single GPU (Tesla V100-SXM2-16GB for BT and BTE, and Tesla V100-SXM2-32GB for the other methods).
Software Dependencies | No | No specific software versions (e.g., library or solver names with version numbers) were explicitly stated.
Experiment Setup | Yes | For the binary tree (BT) kernels, we use p = min(8, ⌊150/d⌋ + 1), and recall q = pd. We set λ = 1/n. We train the bit order and weights to minimize training NLL. For the binary tree ensemble (BTE), we use 20 kernels. (A helper that computes these hyperparameter values appears after this table.)
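
The pseudocode row names a linear transformation and an inverse/determinant routine built on an "SROS linear operator". As a hedged illustration only (reading SROS as a sparse rank-one sum, i.e. A = sum_j w_j s_j s_j^T with sparse 0/1 indicator vectors s_j; this expansion is an assumption based on the operator's name, not stated in the quoted text), a matrix-vector product with such an operator costs time proportional to the number of stored indices rather than O(n^2):

    import numpy as np

    def sros_matvec(groups, weights, x):
        """Compute y = A @ x for A = sum_j weights[j] * s_j s_j^T, where s_j
        is the 0/1 indicator vector of the index set groups[j].

        Each term touches only len(groups[j]) entries of x, so the cost is
        proportional to the total number of stored indices, not O(n^2).
        """
        y = np.zeros_like(x, dtype=float)
        for idx, w in zip(groups, weights):
            y[idx] += w * x[idx].sum()  # (s_j^T x) scattered back along s_j
        return y

    # Tiny check against the dense equivalent (n = 4, two overlapping groups).
    x = np.array([1.0, 2.0, 3.0, 4.0])
    groups = [np.array([0, 1]), np.array([1, 2, 3])]
    weights = [0.5, 2.0]
    A = sum(w * np.outer(np.isin(np.arange(4), g), np.isin(np.arange(4), g))
            for g, w in zip(groups, weights))
    assert np.allclose(sros_matvec(groups, weights, x), A @ x)

This sketch covers only the matrix-vector product of Algorithm 1; the inverse and determinant of I + A in Algorithm 2 would exploit the same sparse structure.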
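
The Experiment Setup row is concrete enough to compute directly. A minimal sketch, assuming n is the number of training points and d the input dimension (the helper name btgp_hyperparameters is illustrative, not from the paper):

    def btgp_hyperparameters(n, d):
        """Hyperparameter values quoted in the Experiment Setup row:
        p = min(8, floor(150/d) + 1), q = p * d, lambda = 1/n.
        """
        p = min(8, 150 // d + 1)  # a per-dimension quantity, since q = p * d
        q = p * d
        lam = 1.0 / n
        return p, q, lam

    # Example: n = 10_000 training points in d = 20 dimensions gives
    # p = min(8, 7 + 1) = 8, q = 160, lambda = 1e-4.
    print(btgp_hyperparameters(10_000, 20))  # (8, 160, 0.0001)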