Multi-task Learning for Aggregated Data using Gaussian Processes
Authors: Fariba Yousefi, Michael T. Smith, Mauricio Álvarez
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show examples of the model in a synthetic example, a fertility dataset and an air pollution prediction application. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Sheffield {f.yousefi, m.t.smith, mauricio.alvarez}@sheffield.ac.uk |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not contain structured pseudocode or algorithm blocks with clear labels. |
| Open Source Code | Yes | The implementation is based on the GPy framework and is available on Github: https://github.com/frb-yousefi/aggregated-multitask-gp. |
| Open Datasets | Yes | a subset of the Canadian fertility dataset is used from the Human Fertility Database (HFD, https://www.humanfertility.org). The dataset consists of live births statistics by year, age of mother and birth order. |
| Dataset Splits | Yes | For training the multi-task model, we select N1 = 200 from the 250 observations for task 1 and use all N2 = 125 for the second task. The other 50 data points for task 1 correspond to a gap in the interval [130, 180] that we use as the test set. ... The dataset was randomly split into 1640 training points and 1000 test points. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the "LBFGS-B algorithm" and "the Adam optimiser, included in climin library", and states that "The implementation is based on the GPy framework", but it does not provide version numbers for any of these software components. |
| Experiment Setup | Yes | In these examples, we use k-means clustering over the input data, with k = M, to initialise the values of the inducing inputs, Z, which are also kept fixed during optimisation. ... We used 100 fixed inducing variables and mini-batches of size 50 samples. ... We used 2000 iterations of the variational EM algorithm, with 200 evenly spaced inducing points and a fixed lengthscale of 0.75 hours. We only optimise the parameters of the coregionalisation matrix B1 ∈ R^{2×2} and the variance of the noise of each Gaussian likelihood. |
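The inducing-point initialisation quoted in the setup row above (k-means over the inputs with k = M, then Z held fixed) can be sketched in plain NumPy. This is an illustrative sketch, not the authors' GPy-based implementation; the function name `kmeans_init_inducing` and the toy data are assumptions for the example.

```python
import numpy as np

def kmeans_init_inducing(X, M, n_iters=20, seed=0):
    """Initialise M inducing inputs Z via k-means (Lloyd's algorithm) on X.

    A minimal sketch of the initialisation the paper describes
    (k-means with k = M over the input data); the paper's actual
    implementation uses the GPy framework.
    """
    rng = np.random.default_rng(seed)
    # Start centroids from M distinct points sampled from X.
    Z = X[rng.choice(len(X), size=M, replace=False)].copy()
    for _ in range(n_iters):
        # Assign each input to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - Z[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned inputs.
        for m in range(M):
            members = X[assign == m]
            if len(members):
                Z[m] = members.mean(axis=0)
    return Z

# Toy example: 250 one-dimensional inputs, M = 10 inducing points.
X = np.linspace(0.0, 1.0, 250)[:, None]
Z = kmeans_init_inducing(X, M=10)
print(Z.shape)  # (10, 1)
```

In a sparse-GP setting, the resulting `Z` would then be passed to the model as fixed inducing inputs, matching the paper's note that Z is "kept fixed during optimisation".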