Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Efficient Gaussian Process Classification Using Pólya-Gamma Data Augmentation

Authors: Florian Wenzel, Théo Galy-Fajou, Christian Donner, Marius Kloft, Manfred Opper | Pages 5417-5424

AAAI 2019 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnitude faster than the state-of-the-art while being competitive in terms of prediction performance.
Researcher Affiliation | Academia | (1) TU Kaiserslautern, Germany; (2) TU Berlin, Germany; (3) University of Southern California, USA
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available via Github. (Footnote 1: https://github.com/theogf/AugmentedGaussianProcesses.jl)
Open Datasets | Yes | We experiment on 12 datasets from the OpenML website and the UCI repository ranging from 768 to 11 million data points.
Dataset Splits | Yes | For each dataset we perform a 10-fold cross-validation and for datasets with more than 1 million points, we limit the test set to 100,000 points.
Hardware Specification | No | The paper only states that 'All algorithms are run on a single CPU.' without providing specific CPU models or other hardware details.
Software Dependencies | Yes | We use GPflow version 1.2.0.
Experiment Setup | Yes | The kernel hyperparameters are initialized to the same values and optimized using Adam (Kingma and Ba 2014), while inducing points location are initialized via k-means++ (Arthur and Vassilvitskii 2007) and kept fixed during training. For all datasets, we use 100 inducing points and a mini-batch size of 100 points. For X-GPC we find that the following simple convergence criterion on the global parameters leads to good results: a sliding window average being smaller than a threshold of 10^-4.
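
The Dataset Splits entry above describes 10-fold cross-validation with the test set limited to 100,000 points on the largest datasets. The following is a minimal sketch of one way such splits could be generated, not the authors' code. The function name `kfold_indices`, the random seed, and the choice to simply leave the dropped test points unused are all assumptions made for illustration.

```python
import random


def kfold_indices(n_points, n_folds=10, max_test=100_000, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper: the cap on the test fold follows the summary
    above; how the cap is applied (random truncation, leftover points
    left unused) is an assumption of this sketch.
    """
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)

    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_points, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        # Indices are already shuffled, so truncating is a random subsample.
        if len(test) > max_test:
            test = test[:max_test]
        yield train, test


if __name__ == "__main__":
    for train, test in kfold_indices(25, n_folds=10, max_test=2):
        print(len(train), len(test))
```

With 10 folds, any dataset of more than 1 million points produces test folds larger than 100,000 points, so the cap applies exactly in the regime the summary mentions.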
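
The Experiment Setup entry above mentions a convergence criterion based on a sliding window average of the global parameters falling below a threshold. The summary does not say which quantity is averaged or how wide the window is, so the sketch below makes assumptions: it averages the mean absolute change of the parameters between consecutive iterations, uses a hypothetical window of 5 iterations, and defaults the threshold to 1e-4 as our reading of the quoted value. The class name `SlidingWindowConvergence` is likewise illustrative and not taken from the paper's code.

```python
from collections import deque


class SlidingWindowConvergence:
    """Stop when the windowed average parameter change drops below a threshold.

    Assumptions of this sketch: the monitored quantity is the mean absolute
    change between consecutive parameter vectors, and the window size is 5.
    """

    def __init__(self, window=5, threshold=1e-4):
        self.threshold = threshold
        self.changes = deque(maxlen=window)  # keeps only the latest changes
        self.previous = None

    def update(self, params):
        """Record the current parameters; return True once converged."""
        params = list(params)
        if self.previous is not None:
            change = sum(abs(a - b) for a, b in zip(params, self.previous))
            self.changes.append(change / len(params))
        self.previous = params
        # Only decide once the window is full, to avoid stopping too early.
        window_full = len(self.changes) == self.changes.maxlen
        return window_full and sum(self.changes) / len(self.changes) < self.threshold


if __name__ == "__main__":
    check = SlidingWindowConvergence(window=3)
    for step in range(100):
        params = [1.0 + 0.5 ** step]  # toy parameter that settles towards 1.0
        if check.update(params):
            print("converged at step", step)
            break
```

Averaging over a window rather than testing a single step makes the stopping rule less sensitive to the noise that mini-batch updates introduce into individual iterations.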