Variational Learning on Aggregate Outputs with Gaussian Processes
Authors: Ho Chung Law, Dino Sejdinovic, Ewan Cameron, Tim Lucas, Seth Flaxman, Katherine Battle, Kenji Fukumizu
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our framework to a challenging and important problem, the fine-scale spatial modelling of malaria incidence, with over 1 million observations. ... Our contributions can be summarised as follows. A general framework is developed... In experiments, it is demonstrated that the proposed methods can scale to dataset sizes of more than 1 million observations. We thoroughly investigate an application of the developed methodology to disease mapping from coarse measurements, where the observation model is Poisson, giving encouraging results. |
| Researcher Affiliation | Academia | Ho Chung Leon Law (University of Oxford), Dino Sejdinovic (University of Oxford), Ewan Cameron (University of Oxford), Tim CD Lucas (University of Oxford), Seth Flaxman (Imperial College London), Katherine Battle (University of Oxford), Kenji Fukumizu (Institute of Statistical Mathematics) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/hcllaw/VBAgg |
| Open Datasets | Yes | We first demonstrate our method on the swiss roll dataset... The swiss roll manifold function (for sampling) can be found in the Python scikit-learn package. (A sampling sketch follows the table.) |
| Dataset Splits | Yes | we split the dataset into 4 parts, namely train, early-stop, validation and test set... We consider 576 bags for train, 95 bags each for validation and early-stop, with 191 bags for testing, with different splits across different trials, selecting them to ensure distributions of labels are similar across sets. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments, only mentioning the use of "TensorFlow". |
| Software Dependencies | No | The paper mentions "TensorFlow" and "Adam [12]" but does not specify version numbers for these or any other software libraries, which is required for reproducibility. |
| Experiment Setup | Yes | We implement our models in TensorFlow and use SGD with Adam [12] to optimise their respective objectives, and we split the dataset into 4 parts, namely train, early-stop, validation and test set... The validation set is used for parameter tuning of any regularisation scaling, as well as learning rate, layer size and multiple initialisations. For the choice of the kernel k for VBAgg and Nyström, we use the RBF kernel, with the bandwidth parameter learnt. For landmark locations, we use the K-means++ algorithm. (An illustrative sketch of these components follows the table.) |
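
The swiss roll sampling mentioned in the Open Datasets row comes from scikit-learn. Below is a minimal sketch, assuming an illustrative sample size, noise level and random seed (the quoted text does not specify these values):

```python
# Sample points from the swiss roll manifold via scikit-learn, as cited in
# the Open Datasets row. n_samples, noise and random_state are illustrative
# assumptions, not values taken from the paper.
from sklearn.datasets import make_swiss_roll

# X: (n_samples, 3) points on the manifold; t: the univariate coordinate
# along the roll that generated each point.
X, t = make_swiss_roll(n_samples=1000, noise=0.0, random_state=0)
print(X.shape, t.shape)  # (1000, 3) (1000,)
```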
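
The bag counts in the Dataset Splits row (576 train, 95 early-stop, 95 validation, 191 test) could be realised as below. This is a sketch under a uniform-shuffle assumption; it does not implement the paper's additional step of selecting splits so that label distributions are similar across sets:

```python
# Hedged sketch of a bag-level split into train / early-stop / validation /
# test with the counts quoted in the Dataset Splits row. The uniform random
# shuffle is an assumption; the paper also balances label distributions.
import numpy as np

n_bags = 576 + 95 + 95 + 191  # 957 bags in total
rng = np.random.default_rng(0)
idx = rng.permutation(n_bags)

train_idx = idx[:576]
early_stop_idx = idx[576:671]   # 95 bags
valid_idx = idx[671:766]        # 95 bags
test_idx = idx[766:]            # 191 bags
assert len(test_idx) == 191
```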
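
Finally, a sketch of the components named in the Experiment Setup row: an RBF kernel with a learnt bandwidth, landmark locations from k-means++, and Adam as the optimiser. The data, shapes, placeholder objective and every hyperparameter below are assumptions for illustration only; this is not the authors' VBAgg implementation (see the repository linked above for that):

```python
# Illustrative-only sketch: RBF kernel with a learnable bandwidth,
# k-means++ landmarks, and one Adam step. The loss is a placeholder,
# not the VBAgg variational objective.
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

X = np.random.randn(500, 2).astype("float32")  # toy inputs (assumption)
landmarks = KMeans(n_clusters=20, init="k-means++", n_init=10,
                   random_state=0).fit(X).cluster_centers_.astype("float32")

log_bandwidth = tf.Variable(0.0)  # bandwidth learnt on the log scale

def rbf(A, B):
    """RBF kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 * bw^2))."""
    bw = tf.exp(log_bandwidth)
    sq_dists = (tf.reduce_sum(A**2, 1)[:, None]
                - 2.0 * A @ tf.transpose(B)
                + tf.reduce_sum(B**2, 1)[None, :])
    return tf.exp(-sq_dists / (2.0 * bw**2))

opt = tf.keras.optimizers.Adam(learning_rate=1e-2)
with tf.GradientTape() as tape:
    K_zx = rbf(tf.constant(landmarks), tf.constant(X))
    loss = -tf.reduce_mean(K_zx)  # placeholder objective for the sketch
grads = tape.gradient(loss, [log_bandwidth])
opt.apply_gradients(zip(grads, [log_bandwidth]))
```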