Efficient Gaussian Process Classification Using Pólya-Gamma Data Augmentation

Authors: Florian Wenzel, Théo Galy-Fajou, Christian Donner, Marius Kloft, Manfred Opper (pp. 5417–5424)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnitude faster than the state-of-the-art while being competitive in terms of prediction performance.
Researcher Affiliation Academia 1TU Kaiserslautern, Germany, 2TU Berlin, Germany, 3University of Southern California, USA
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Our code is available via Github1. 1https://github.com/theogf/AugmentedGaussianProcesses.jl
Open Datasets Yes We experiment on 12 datasets from the Open ML website and the UCI repository ranging from 768 to 11 million data points.
Dataset Splits Yes For each dataset we perform a 10-fold cross-validation and for datasets with more than 1 million points, we limit the test set to 100,000 points.
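The split protocol above (10-fold cross-validation, with test folds capped at 100,000 points for datasets exceeding 1 million points) can be sketched as follows. This is a minimal illustration, not the authors' code; the random seed and capping-by-truncation are assumptions.

```python
import numpy as np

def ten_fold_splits(n, max_test=100_000, seed=0):
    """Yield (train_idx, test_idx) pairs for 10-fold CV.

    Each test fold is capped at max_test points, mirroring the paper's
    protocol for datasets with more than 1 million points. The seed and
    the choice to cap by truncation are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    folds = np.array_split(perm, 10)
    for k in range(10):
        test = folds[k][:max_test]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        yield train, test
```

For a dataset of 11 million points, each raw fold holds 1.1 million points, so the cap reduces every test fold to 100,000 points while the training folds stay complete.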
Hardware Specification No The paper only states that 'All algorithms are run on a single CPU.' without providing specific CPU models or other hardware details.
Software Dependencies Yes We use GPflow version 1.2.0.
Experiment Setup Yes The kernel hyperparameters are initialized to the same values and optimized using Adam (Kingma and Ba 2014), while inducing point locations are initialized via k-means++ (Arthur and Vassilvitskii 2007) and kept fixed during training. For all datasets, we use 100 inducing points and a mini-batch size of 100 points. For X-GPC we find that the following simple convergence criterion on the global parameters leads to good results: a sliding window average being smaller than a threshold of 10^-4.
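The convergence criterion described above (a sliding-window average of the change in the global parameters falling below 10^-4) can be sketched as follows. The threshold matches the paper; the window size of 5 is an assumption, since the paper does not state it.

```python
from collections import deque

def make_convergence_check(window=5, threshold=1e-4):
    """Sliding-window convergence test for an iterative optimizer.

    threshold=1e-4 follows the paper's stated criterion; window=5 is an
    assumption (the window size is not specified in the paper).
    """
    history = deque(maxlen=window)

    def converged(delta):
        # delta: magnitude of the latest update to the global parameters
        history.append(delta)
        if len(history) < window:
            return False  # not enough iterations to fill the window
        return sum(history) / window < threshold

    return converged
```

In a training loop, `converged` would be called once per iteration with the norm of the parameter update, stopping optimization once the windowed average drops below the threshold.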