Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models
Authors: Yarin Gal, Mark van der Wilk, Carl Edward Rasmussen
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally study the properties of the suggested inference, showing that it scales well with data and computational resources, and that its running time scales inversely with available computational power. We run regression experiments on 2008 US flight data with 2 million records and perform classification tests on MNIST using the latent variable model. |
| Researcher Affiliation | Academia | University of Cambridge {yg279,mv310,cer54}@cam.ac.uk |
| Pseudocode | No | The paper describes a 'Distributed Inference Algorithm' in section 4.2 using numbered steps, but it is presented in paragraph form rather than as structured pseudocode or an algorithm block (a minimal map-reduce sketch of the described procedure follows this table). |
| Open Source Code | Yes | The proposed inference was implemented in Python using the Map-Reduce framework [Dean and Ghemawat, 2008] to work on multi-core architectures, and is available as an open-source package (see http://github.com/markvdw/GParML). |
| Open Datasets | Yes | We evaluate GP regression on the US flight dataset [Hensman et al., 2013] with up to 2 million points, and compare the results we got to an array of baselines, demonstrating the utility of using GPs for large scale regression. We then present density modelling results over the MNIST dataset, performing imputation tests and digit classification based on model comparison [Titsias and Lawrence, 2010]. |
| Dataset Splits | Yes | We selected the first 800K points from the dataset and then split the data randomly into a test set and a training set, using 100K points for testing. We then used the first 7K and 70K points from the large training set to construct the smaller training sets, using the same test set for comparison (see the split-construction sketch after this table). |
| Hardware Specification | No | The paper mentions 'on a 64 cores machine' but does not specify any particular CPU or GPU models, memory, or other detailed hardware specifications for the experiments. |
| Software Dependencies | No | The paper states 'The proposed inference was implemented in Python using the Map-Reduce framework', but it does not specify any version numbers for Python or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | We initialise our latent points using PCA and our inducing inputs using k-means with added noise. We optimise using both L-BFGS and scaled conjugate gradient [Møller, 1993]. ... We trained our model with 100 inducing points for 500 iterations using L-BFGS optimisation... Training on the full MNIST dataset took 20 minutes for the longest running model, using 500 iterations of SCG. (See the initialisation sketch after this table.) |
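
The distributed inference referenced in the Pseudocode and Open Source Code rows follows a map-reduce pattern: each worker computes additive sufficient statistics over its data chunk, and the master sums them before evaluating the global variational bound and its gradients. The sketch below is a minimal illustration of that structure, not the authors' GParML code; the kernel, the particular statistics, and the use of `multiprocessing.Pool` are simplifying assumptions.

```python
# Hedged sketch of the map-reduce structure described in Section 4.2:
# workers compute per-chunk sufficient statistics, the master sums them.
import numpy as np
from multiprocessing import Pool

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def partial_stats(args):
    """Map step: additive sufficient statistics for one data chunk."""
    X_chunk, y_chunk, Z = args
    Knm = rbf(X_chunk, Z)                  # N_chunk x M cross-covariance
    return {
        "KmnKnm": Knm.T @ Knm,             # sum_i k(Z, x_i) k(x_i, Z)
        "Kmny":   Knm.T @ y_chunk,         # sum_i k(Z, x_i) y_i
        "yy":     float(y_chunk @ y_chunk),
        "n":      X_chunk.shape[0],
    }

def reduce_stats(stats_list):
    """Reduce step: sum the per-chunk statistics (they are additive)."""
    total = {k: 0 for k in stats_list[0]}
    for s in stats_list:
        for k, v in s.items():
            total[k] = total[k] + v
    return total

def distributed_stats(X, y, Z, n_workers=4):
    """One global pass: scatter chunks to workers, map, then reduce.

    The master would then use the reduced statistics to evaluate the
    collapsed variational bound and its gradients, and take one
    optimisation step over Z and the hyperparameters.
    """
    chunks = np.array_split(np.arange(X.shape[0]), n_workers)
    jobs = [(X[idx], y[idx], Z) for idx in chunks]
    with Pool(n_workers) as pool:
        stats = pool.map(partial_stats, jobs)
    return reduce_stats(stats)
```

Because the per-chunk statistics are plain sums over data points, the reduce step is a summation, which is what allows the reported running time to scale inversely with the number of cores.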
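The split protocol in the Dataset Splits row (first 800K flight records, a random 100K-point test set, and nested 7K/70K/700K training sets sharing that test set) could be reproduced along the lines of the sketch below. Loading the flight data, the random seed, and the exact shuffling procedure are assumptions, not details given in the paper.

```python
# Hedged sketch of the train/test split construction; the seed and the
# permutation-based shuffle are assumptions.
import numpy as np

def make_splits(X, y, seed=0):
    """First 800K records; random 100K held out for test; nested train sets."""
    rng = np.random.default_rng(seed)
    X, y = X[:800_000], y[:800_000]
    perm = rng.permutation(800_000)
    test_idx, train_idx = perm[:100_000], perm[100_000:]
    X_test, y_test = X[test_idx], y[test_idx]
    X_train, y_train = X[train_idx], y[train_idx]      # 700K training points
    splits = {
        "7K":   (X_train[:7_000],  y_train[:7_000]),
        "70K":  (X_train[:70_000], y_train[:70_000]),
        "700K": (X_train,          y_train),
    }
    return splits, (X_test, y_test)
```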
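The Experiment Setup row describes initialising latent points with PCA and inducing inputs with k-means plus added noise, followed by L-BFGS or SCG optimisation. A sketch of that initialisation, assuming scikit-learn and an arbitrary noise scale (the paper does not specify one), might look like:

```python
# Hedged sketch of the initialisation described in the Experiment Setup row;
# latent_dim, noise_scale and the seed are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def initialise(Y, latent_dim=10, n_inducing=100, noise_scale=0.01, seed=0):
    """PCA init for latent points X; k-means (+ jitter) for inducing inputs Z."""
    X0 = PCA(n_components=latent_dim).fit_transform(Y)
    km = KMeans(n_clusters=n_inducing, n_init=10, random_state=seed).fit(X0)
    rng = np.random.default_rng(seed)
    Z0 = km.cluster_centers_ + noise_scale * rng.standard_normal(
        (n_inducing, latent_dim))
    return X0, Z0
```

The resulting `X0` and `Z0` would then be handed to the optimiser; the paper reports using 100 inducing points and 500 iterations of L-BFGS or SCG.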