Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes

Authors: Yves-Laurent Kom Samo, Stephen Roberts

ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an MCMC sampler and show that the resulting model is faster, more accurate, and generates less correlated samples than competing approaches on both synthetic and real-life data. We selected four data sets to illustrate the performance of our model. We restricted ourselves to one synthetic data set for brevity.
Researcher Affiliation | Academia | Yves-Laurent Kom Samo (YLKS@ROBOTS.OX.AC.UK), Stephen Roberts (SJROB@ROBOTS.OX.AC.UK), Department of Engineering Science and Oxford-Man Institute, University of Oxford
Pseudocode | Yes | Algorithm 1: Selection of inducing points (a hedged illustrative sketch of greedy inducing-point selection appears after the table).
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for its methodology is publicly available.
Open Datasets | Yes | We ran our model on a standard 1-dimensional real-life data set (the coal mine disasters data set used in (Jarrett, 1979); 191 points) and a standard 2-dimensional real-life data set (spatial locations of bramble canes (Diggle, 1983); 823 points). Finally, we ran our model on a real-life data set large enough to cause problems for competing models: the UTC timestamps (expressed in hours in the day) of Twitter updates in English published in the (Twitter sample stream, 2014) on September 1st 2014 (188,544 points).
Dataset Splits | No | The paper evaluates predictive probability on '10 held out PPP draws' for the LP metric, but it does not specify explicit training, validation, or test splits (e.g., percentages, sample counts, or a cross-validation setup) needed for full reproducibility (a hedged sketch of a held-out point-process log-predictive computation appears after the table).
Hardware Specification | No | The paper notes that a 'typical personal computer cannot handle' certain computations required by competing models, but it gives no details about the hardware used for its own experiments (e.g., CPU/GPU models, memory).
Software Dependencies | No | The paper mentions general scientific computing packages ('R, Matlab and Scipy') but does not provide version numbers for any software dependency used in the experiments.
Experiment Setup | Yes | We use a squared exponential kernel for γ and scaled sigmoid Gaussian priors for the kernel hyper-parameters; that is, θ_i = θ_i^max / (1 + exp(−x_i)), where the x_i are i.i.d. standard Normal. In each experiment we generated 5000 samples after burn-in (1000 samples). For each data set we used the set of inducing points that yielded a 95% normalized utility. The Legendre polynomial order is p = 20. (Table 1 hyper-parameter bounds: Synthetic h_max = 10.0, l_max = 25.0; Coal Mine h_max = 10.0, l_max = 50.0; Bramble h_max = 10.0, l_max = 0.25; Twitter h_max = 10.0, l_max = 5.0.) A hedged sketch of the scaled sigmoid prior appears after the table.
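
The Pseudocode row points to "Algorithm 1: Selection of inducing points", and the Experiment Setup row mentions stopping at a 95% normalized utility. The paper's algorithm is not reproduced here; as a stand-in, the sketch below greedily adds inducing points that maximize a standard Nystrom variance-reduction utility, normalized by the total prior variance, and stops at a 95% threshold. The utility, all function names, and the kernel parameterization are our assumptions, not the paper's.

```python
import numpy as np

def se_kernel(A, B, h=1.0, l=1.0):
    """Squared exponential kernel between row-vector sets A (n,d) and B (m,d)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return h ** 2 * np.exp(-0.5 * d2 / l ** 2)

def explained_variance(X, idx, jitter=1e-8):
    """tr(K_fu K_uu^{-1} K_uf): prior GP variance at the data X captured
    by the inducing points X[idx] (Nystrom approximation)."""
    Kuu = se_kernel(X[idx], X[idx]) + jitter * np.eye(len(idx))
    Kfu = se_kernel(X, X[idx])
    return np.trace(Kfu @ np.linalg.solve(Kuu, Kfu.T))

def greedy_inducing_points(X, threshold=0.95):
    """Greedily add inducing points until the normalized utility
    (explained variance / total prior variance) reaches threshold."""
    total = np.trace(se_kernel(X, X))
    chosen, remaining = [], list(range(len(X)))
    while remaining:
        best = max(remaining, key=lambda c: explained_variance(X, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
        if explained_variance(X, chosen) / total >= threshold:
            break
    return chosen

X = np.random.default_rng(2).uniform(0.0, 24.0, size=(40, 1))
print(greedy_inducing_points(X, threshold=0.95))
```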
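
The Dataset Splits row mentions an LP metric computed on "10 held out PPP draws". The paper's exact definition is not quoted here; a common choice is the inhomogeneous Poisson process log-likelihood of each held-out draw under a fixed intensity, LP = Σ_i log λ(x_i) − ∫ λ(x) dx. A minimal sketch under that assumption, with a trapezoidal quadrature on a 1-D domain and a toy intensity of our own invention:

```python
import numpy as np

def poisson_process_log_lik(points, intensity, grid):
    """Log-likelihood of one point-process draw under intensity λ:
    sum_i log λ(x_i) - ∫ λ(x) dx (integral by trapezoidal rule)."""
    log_term = np.sum(np.log(intensity(points)))
    integral = np.trapz(intensity(grid), grid)
    return log_term - integral

# Hypothetical example on [0, 24) (hours in a day, as in the Twitter data).
intensity = lambda t: 50.0 + 30.0 * np.sin(2 * np.pi * t / 24.0) ** 2
grid = np.linspace(0.0, 24.0, 1001)
rng = np.random.default_rng(1)
held_out_draws = [np.sort(rng.uniform(0.0, 24.0, size=60)) for _ in range(10)]

# LP as the average log predictive probability over the 10 held-out draws.
lp = np.mean([poisson_process_log_lik(d, intensity, grid) for d in held_out_draws])
print(f"LP over 10 held-out draws: {lp:.2f}")
```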
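
The Experiment Setup row quotes a scaled sigmoid Gaussian prior, θ_i = θ_i^max / (1 + exp(−x_i)) with x_i i.i.d. standard Normal, together with a squared exponential kernel. Below is a minimal Python sketch of both, assuming the Table 1 bounds (h_max, l_max) play the role of θ^max; the (h, l) parameterization and all names are ours, since the paper excerpt only names the kernel family.

```python
import numpy as np

def scaled_sigmoid_prior_sample(theta_max, rng, size=None):
    """Draw theta = theta_max / (1 + exp(-x)) with x ~ N(0, 1).

    Matches the quoted prior: each hyper-parameter is a sigmoid-squashed
    standard Normal, scaled into (0, theta_max)."""
    x = rng.standard_normal(size)
    return theta_max / (1.0 + np.exp(-x))

def squared_exponential_kernel(X1, X2, h, l):
    """SE kernel k(a, b) = h^2 * exp(-||a - b||^2 / (2 l^2)) for
    X1 of shape (n, d) and X2 of shape (m, d)."""
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return h ** 2 * np.exp(-0.5 * sq_dists / l ** 2)

rng = np.random.default_rng(0)
# Twitter row of Table 1: h_max = 10.0, l_max = 5.0.
h = scaled_sigmoid_prior_sample(10.0, rng)
l = scaled_sigmoid_prior_sample(5.0, rng)
X = rng.uniform(0.0, 24.0, size=(5, 1))  # timestamps in hours
K = squared_exponential_kernel(X, X, h, l)
```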