Scalable Training of Inference Networks for Gaussian-Process Models
Authors: Jiaxin Shi, Mohammad Emtiyaz Khan, Jun Zhu
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show comparable and, sometimes, superior performance to existing sparse variational GP methods. (Sec. 5, Experiments:) Throughout all experiments, M denotes both the number of inducing points in SVGP and the number of measurement points in GPNet and FBNN (Sun et al., 2019). |
| Researcher Affiliation | Academia | 1Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua University, Beijing, China 2RIKEN Center for Advanced Intelligence project, Tokyo, Japan. Correspondence to: Jiaxin Shi <shijx15@mails.tsinghua.edu.cn>, Jun Zhu <dcszj@tsinghua.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 GPNet for supervised learning |
| Open Source Code | Yes | Code is available at https://github.com/thjashin/gp-infer-net. |
| Open Datasets | Yes | We consider the inference of a GP with RBF kernel on the synthetic dataset introduced in Snelson & Ghahramani (2006). We evaluate our method on seven standard regression benchmark datasets. We conducted experiments on the airline delay dataset, which includes 5.9 million flight records in the USA from Jan to Apr in 2018. We test GPNet on MNIST and CIFAR10 with a CNN-GP prior. |
| Dataset Splits | Yes | The regression results are averaged over 10 random splits for small datasets (n < 5000) and 3 splits for large datasets (n >= 5000). Following the protocol in Hensman et al. (2013), we randomly take 700K points for training and 100K for testing. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | Implementations are based on a customized version of GPflow (de G. Matthews et al., 2017; Sun et al., 2018) and ZhuSuan (Shi et al., 2017). No version numbers for software are provided. |
| Experiment Setup | Yes | We ran for 40K iterations and used learning rate 0.003 for all methods. For fair comparison, for all three methods we pretrain the prior hyperparameters for 100 iterations using the GP marginal likelihood and keep them fixed thereafter. We vary M in {2, 5, 20} for all methods. The networks used in GPNet and FBNN are the same RFE with 20 hidden units. |
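
The Dataset Splits row quotes the paper's protocol: 10 random splits for small regression datasets (n < 5000) and 3 splits for large ones (n >= 5000). Below is a minimal sketch of such a splitting routine, assuming NumPy arrays and a 10% test fraction; the fraction and the function interface are assumptions, not quoted from the paper.

```python
import numpy as np

def random_splits(X, y, test_fraction=0.1, seed=0):
    """Yield random train/test splits following the quoted protocol:
    10 random splits if n < 5000, otherwise 3 splits.
    The 10% test fraction is an assumption for illustration."""
    n = len(X)
    n_splits = 10 if n < 5000 else 3
    n_test = int(round(test_fraction * n))
    rng = np.random.RandomState(seed)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test_idx, train_idx = perm[:n_test], perm[n_test:]
        yield (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```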
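The Experiment Setup row lists the quoted training hyperparameters. The sketch below simply collects them into one configuration object for reference; the dictionary structure and key names are illustrative assumptions, not taken from the paper or its code release.

```python
# Quoted hyperparameters from the Experiment Setup row; names are assumptions.
EXPERIMENT_CONFIG = {
    "iterations": 40_000,        # "We ran for 40K iterations"
    "learning_rate": 3e-3,       # "used learning rate 0.003 for all methods"
    "pretrain_iterations": 100,  # prior hyperparameters pretrained with the GP
                                 # marginal likelihood, then kept fixed
    "M_values": [2, 5, 20],      # inducing / measurement points varied for all methods
    "rfe_hidden_units": 20,      # same RFE network used by GPNet and FBNN
}
```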