Streaming Sparse Gaussian Process Approximations
Authors: Thang D. Bui, Cuong V. Nguyen, Richard E. Turner
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed framework is assessed using synthetic and real-world datasets. The SSGP method is evaluated in terms of speed, memory usage, and accuracy (log-likelihood and error). The method was implemented on GPflow [20] and compared against GPflow's versions of the following baselines: exact GP (GP), sparse GP using the collapsed bound (SGP), and stochastic variational inference using the uncollapsed bound (SVI). A minimal GPflow sketch of these baselines follows the table. |
| Researcher Affiliation | Academia | Thang D. Bui, Cuong V. Nguyen, Richard E. Turner; Department of Engineering, University of Cambridge, UK ({tdb40,vcn22,ret26}@cam.ac.uk) |
| Pseudocode | No | The paper describes the proposed method mathematically but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | An implementation of the proposed method can be found at http://github.com/thangbui/streaming_sparse_gp. |
| Open Datasets | Yes | We first consider modelling a segment of the pseudo periodic synthetic dataset [22], previously used for testing indexing schemes in time-series databases. The second dataset is an audio signal prediction dataset, produced from the TIMIT database [23] and previously used to evaluate GP approximations [24]. The second set of experiments considers the OS Terrain 50 dataset that contains spot heights of landscapes in Great Britain computed on a grid. The dataset is available at: https://data.gov.uk/dataset/os-terrain-50-dtm. |
| Dataset Splits | No | The paper specifies interleaved training and testing sets (e.g., 'Training and testing sets are chosen interleaved so that their sizes are both 12,000') but does not explicitly mention a distinct validation set or the methodology for creating one. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The method was implemented on GPflow [20]. While GPflow is named and cited (GPflow: A Gaussian process library using TensorFlow, Journal of Machine Learning Research, 2017), a specific version number for GPflow or TensorFlow is not provided, nor are any other software dependencies with version numbers. |
| Experiment Setup | Yes | In all the experiments, the RBF kernel with ARD lengthscales is used... All algorithms are assessed in the mini-batch streaming setting with data y_new arriving in batches of size 300 and 500... The first 1,000 examples are used as an initial training set... For SVI, we allow the algorithm to make 100 stochastic gradient updates during each iteration and run preliminary experiments to compare 3 learning rates r = 0.001, 0.01, and 0.1... For all sparse methods (SSGP, SGP, and SVI), we run the experiments with 100 and 200 pseudo-points. A sketch of this streaming protocol also follows the table. |
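
To make the reported setup concrete, here is a minimal sketch of the GP and SGP baselines in the modern GPflow 2.x API. The paper used the 2017 GPflow release, whose API differs, and the toy data, input dimension `D`, and variable names below are illustrative assumptions, not details from the paper.

```python
import numpy as np
import gpflow

# Toy stand-in for the initial training set of 1,000 examples (assumption:
# 1-D inputs and a synthetic target; the paper's datasets differ).
D = 1
X = np.random.rand(1000, D)
Y = np.sin(12 * X) + 0.1 * np.random.randn(1000, 1)

# RBF (squared exponential) kernel with ARD lengthscales, one per input dim.
def rbf_ard():
    return gpflow.kernels.SquaredExponential(lengthscales=np.ones(D))

# Exact GP baseline (GP).
gp = gpflow.models.GPR((X, Y), kernel=rbf_ard())

# Sparse GP with the collapsed bound (SGP), M = 100 pseudo-points,
# initialised at a random subset of the training inputs.
M = 100
Z = X[np.random.choice(len(X), M, replace=False)].copy()
sgp = gpflow.models.SGPR((X, Y), kernel=rbf_ard(), inducing_variable=Z)

# Hyperparameters and pseudo-inputs are fit by maximising the collapsed bound.
gpflow.optimizers.Scipy().minimize(sgp.training_loss, sgp.trainable_variables)
```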
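The mini-batch streaming setting can likewise be sketched for the SVI baseline (GPflow's SVGP model with the uncollapsed bound). The batch generator below is a hypothetical stand-in for the real data streams, and Adam with r = 0.01 (one of the three learning rates the paper compares) is an assumption; the SSGP update itself lives in the authors' repository linked above.

```python
import numpy as np
import tensorflow as tf
import gpflow

D, M, BATCH = 1, 100, 300

def stream_of_batches(n_batches=5):
    """Hypothetical stand-in for data y_new arriving in batches of size 300."""
    for _ in range(n_batches):
        X = np.random.rand(BATCH, D)
        Y = np.sin(12 * X) + 0.1 * np.random.randn(BATCH, 1)
        yield X, Y

# SVI baseline: uncollapsed bound, M = 100 pseudo-points.
svi = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(lengthscales=np.ones(D)),
    likelihood=gpflow.likelihoods.Gaussian(),
    inducing_variable=np.random.rand(M, D),
)
opt = tf.optimizers.Adam(learning_rate=0.01)  # r = 0.01 (assumption)

for X_batch, Y_batch in stream_of_batches():
    loss = svi.training_loss_closure((X_batch, Y_batch))
    for _ in range(100):  # 100 stochastic gradient updates per iteration
        opt.minimize(loss, svi.trainable_variables)
```

Note that plain SVGP trained this way only ever sees the current batch, which is precisely the streaming setting in which the paper argues a principled online update (SSGP) is needed.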