Heterogeneous Multi-output Gaussian Process Prediction
Authors: Pablo Moreno-Muñoz, Antonio Artés, Mauricio Álvarez
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of the model on synthetic data and two real datasets: a human behavioral study and a demographic high-dimensional dataset. |
| Researcher Affiliation | Academia | Dept. of Signal Theory and Communications, Universidad Carlos III de Madrid, Spain; Dept. of Computer Science, University of Sheffield, UK |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is publicly available in the repository github.com/pmorenoz/HetMOGP/ |
| Open Datasets | Yes | We preprocessed it to translate all property addresses to latitude-longitude points. For each spatial input, we considered two observations, one binary and one real. The first one indicates if the property is or is not a flat (zero would mean detached, semi-detached, terraced, etc.), and the second one the sale price of houses. Our goal is to predict features of houses given a certain location in the London area. We used a training set of N = 20,000 samples, 1,000 for test predictions and M = 100 inducing points. (https://www.gov.uk/government/collections/price-paid-data). We use our model for predicting a binary output (gender) and a continuous output (logarithmic age) and we compared against independent Chained GPs per output. The binary output is modelled as a Bernoulli distribution and the continuous one as a Gaussian. We obtained an average NLPD value of 0.0191 for both multi-output and independent output models with a slight difference in the standard deviation. (http://archive.ics.uci.edu/ml/) |
| Dataset Splits | Yes | We use a dataset of dimensionality p = 255 and 452 samples that we divide into training, validation and test sets (more details are in the appendix). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components like "LBFGS-B algorithm", "ADADELTA included in the climin library", and "Python implementation" but does not provide specific version numbers for these components. |
| Experiment Setup | Yes | For all the experiments, we consider an RBF kernel for each covariance function k_q(·, ·) and we set Q = 3. For standard optimization we used the LBFGS-B algorithm. When SVI was needed, we considered ADADELTA included in the climin library, and a mini-batch size of 500 samples in every output. |
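
The experiment setup in the last row can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it only builds Q = 3 RBF covariance matrices on one mini-batch of 500 samples. The helper names `rbf_kernel` and `minibatches`, the toy inputs, and the lengthscale values are assumptions made for illustration; the paper's kernel hyperparameters are learned, and its optimizers (LBFGS-B, ADADELTA from climin) are not reproduced here.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """RBF covariance: variance * exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    sq_dist = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    sq_dist = np.maximum(sq_dist, 0.0)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def minibatches(n_samples, batch_size, rng):
    """Yield index arrays that together cover one random permutation of the data."""
    perm = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield perm[start:start + batch_size]


Q = 3             # number of latent functions, as stated in the setup
BATCH_SIZE = 500  # mini-batch size per output, as stated in the setup

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))   # hypothetical 2-D inputs (assumption)
lengthscales = [0.1, 0.5, 1.0]    # hypothetical values, one per latent function

# Take the first mini-batch and evaluate one RBF covariance per latent function.
batch = next(minibatches(len(X), BATCH_SIZE, rng))
kernels = [rbf_kernel(X[batch], X[batch], lengthscale=ls) for ls in lengthscales]
```

Each entry of `kernels` is a 500 x 500 symmetric matrix with a unit diagonal. In a stochastic variational scheme, matrices like these would be recomputed on every mini-batch as the hyperparameters are updated.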