Efficient Gaussian Process Classification Using Pólya-Gamma Data Augmentation

Authors: Florian Wenzel, Théo Galy-Fajou, Christian Donner, Marius Kloft, Manfred Opper (pp. 5417–5424)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnitude faster than the state-of-the-art while being competitive in terms of prediction performance.
Researcher Affiliation Academia 1TU Kaiserslautern, Germany, 2TU Berlin, Germany, 3University of Southern California, USA
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Our code is available via Github1. 1https://github.com/theogf/AugmentedGaussianProcesses.jl
Open Datasets Yes We experiment on 12 datasets from the Open ML website and the UCI repository ranging from 768 to 11 million data points.
Dataset Splits Yes For each dataset we perform a 10-fold cross-validation and for datasets with more than 1 million points, we limit the test set to 100,000 points.
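The split protocol above (10-fold cross-validation, with test folds capped at 100,000 points for datasets exceeding 1 million points) can be sketched as follows. This is a minimal illustration, not the authors' code; the random seed and capping-by-truncation are assumptions.

```python
import numpy as np

def ten_fold_splits(n, max_test=100_000, seed=0):
    """Yield (train_idx, test_idx) pairs for 10-fold CV.

    Each test fold is capped at max_test points, mirroring the paper's
    protocol for datasets with more than 1 million points. The seed and
    the choice to cap by truncation are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    folds = np.array_split(perm, 10)
    for k in range(10):
        test = folds[k][:max_test]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        yield train, test
```

For a dataset of 11 million points, each raw fold holds 1.1 million points, so the cap reduces every test fold to 100,000 points while the training folds stay complete.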
Hardware Specification No The paper only states that 'All algorithms are run on a single CPU.' without providing specific CPU models or other hardware details.
Software Dependencies Yes We use GPflow version 1.2.0.
Experiment Setup Yes The kernel hyperparameters are initialized to the same values and optimized using Adam (Kingma and Ba 2014), while inducing point locations are initialized via k-means++ (Arthur and Vassilvitskii 2007) and kept fixed during training. For all datasets, we use 100 inducing points and a mini-batch size of 100 points. For X-GPC we find that the following simple convergence criterion on the global parameters leads to good results: a sliding window average being smaller than a threshold of 10^-4.
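The convergence criterion described above (a sliding-window average of the change in the global parameters falling below 10^-4) can be sketched as follows. The threshold matches the paper; the window size of 5 is an assumption, since the paper does not state it.

```python
from collections import deque

def make_convergence_check(window=5, threshold=1e-4):
    """Sliding-window convergence test for an iterative optimizer.

    threshold=1e-4 follows the paper's stated criterion; window=5 is an
    assumption (the window size is not specified in the paper).
    """
    history = deque(maxlen=window)

    def converged(delta):
        # delta: magnitude of the latest update to the global parameters
        history.append(delta)
        if len(history) < window:
            return False  # not enough iterations to fill the window
        return sum(history) / window < threshold

    return converged
```

In a training loop, `converged` would be called once per iteration with the norm of the parameter update, stopping optimization once the windowed average drops below the threshold.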