Asynchronous Distributed Variational Gaussian Process for Regression
Authors: Hao Peng, Shandian Zhe, Xiao Zhang, Yuan Qi
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6 (Experiments); Table 1, "Root mean square errors (RMSEs) for 700K/100K US Flight data"; Figure 1, "Root mean square errors (RMSEs) for US flight data as a function of training time". |
| Researcher Affiliation | Collaboration | Purdue University, West Lafayette, IN, USA; Ant Financial Service Group. |
| Pseudocode | Yes | Algorithm 1, "Delayed Proximal Gradient for ADVGP" (a hedged sketch of this update loop follows the table). |
| Open Source Code | No | The paper does not provide any links to source code or explicit statements about releasing code for the described methodology. |
| Open Datasets | Yes | We used the US Flight data (Hensman et al., 2013), which records the arrival and departure times of USA commercial flights between January and April 2008 (http://stat-computing.org/dataexpo/2009/). We used the New York City yellow taxi trip dataset, which consists of 1.21 billion trip records from January 2009 to December 2015 (http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml). |
| Dataset Splits | Yes | In the first group, we randomly chose 700K samples for training; in the second group, we randomly selected 2M training samples. Both groups used 100K samples for testing. We ensured that the training and testing data are non-overlapping. To choose an appropriate delay τ, we sampled another set of training and test data, based on which we tuned τ from {0, 8, 16, 24, 32, 40}. These tuning datasets do not overlap the test data in the evaluation. (A split sketch follows the table.) |
| Hardware Specification | Yes | We ran all the methods on a computer node with 16 CPU cores and 64 GB memory. We conducted two experiments on 4 c4.8xlarge instances of the Amazon EC2 cloud. We used the Amazon EC2 cloud and ran ADVGP on multiple Amazon c4.8xlarge instances, each with 36 vCPUs and 60 GB memory. |
| Software Dependencies | No | The paper mentions software like Vowpal Wabbit, PARAMETERSERVER, and ADADELTA, but does not provide specific version numbers for these software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | For ADVGP, we initialized µ = 0, U = I, and used ADADELTA (Zeiler, 2012) to adjust the step size for the gradient descent before the proximal operation. To choose an appropriate delay τ, we sampled another set of training and test data, based on which we tuned τ from {0, 8, 16, 24, 32, 40}. We chose τ = 32 as it produced the best performance on the tuning datasets. We set m = 50 and initialized the inducing points as the K-means cluster centers from a subset of the 2M training samples. In the second (taxi) experiment, the delay limit τ was selected as 20. (An inducing-point initialization sketch follows the table.) |
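Since the paper's Algorithm 1 ("Delayed Proximal Gradient for ADVGP") is only referenced above, here is a minimal serial sketch of that style of update loop: a gradient computed at a stale parameter copy (bounded delay τ), an ADADELTA-scaled gradient step, then a proximal operation. Beyond that structure, everything below is an assumption: `prox_l2` is a placeholder (ADVGP's actual operator acts on the variational parameters µ and U), and all function names are hypothetical.

```python
import numpy as np

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update (Zeiler, 2012); returns the parameter delta."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad ** 2
    delta = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * grad
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * delta ** 2
    return delta

def prox_l2(w, lam):
    """Placeholder proximal operator (L2 shrinkage). ADVGP's real operator
    acts on the variational parameters; this stand-in only preserves the
    gradient-step-then-prox structure of the loop."""
    return w / (1.0 + lam)

def delayed_prox_gradient(grad_fn, w0, n_steps=2000, tau=32, lam=1e-3, seed=0):
    """Serial simulation of asynchronous updates with bounded staleness:
    each gradient is evaluated at a parameter copy at most `tau` steps old,
    mimicking workers pushing delayed gradients to a parameter server."""
    rng = np.random.default_rng(seed)
    w = w0.astype(float).copy()
    history = [w.copy()]
    state = {"Eg2": np.zeros_like(w), "Edx2": np.zeros_like(w)}
    for _ in range(n_steps):
        delay = int(rng.integers(0, tau + 1))        # staleness <= tau
        w_stale = history[max(0, len(history) - 1 - delay)]
        g = grad_fn(w_stale)                         # gradient at stale copy
        w = w + adadelta_step(g, state)              # ADADELTA-scaled step
        w = prox_l2(w, lam)                          # proximal operation
        history.append(w.copy())
    return w

# Toy check: minimize 0.5 * ||w - 1||^2 despite staleness up to tau = 32.
print(np.round(delayed_prox_gradient(lambda w: w - 1.0, np.zeros(4)), 3))
```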
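For the dataset-splits row, a minimal sketch of drawing disjoint random train/test index sets at the sizes quoted above. The paper states only the sample counts and the non-overlap constraint; uniform sampling via a single permutation, and the total record count used here, are assumptions.

```python
import numpy as np

def split_train_test(n_total, n_train, n_test, seed=0):
    """Disjoint random train/test indices; disjointness holds by
    construction since both sets come from one permutation."""
    idx = np.random.default_rng(seed).permutation(n_total)
    return idx[:n_train], idx[n_train:n_train + n_test]

# Placeholder total record count; substitute the actual dataset size.
train_idx, test_idx = split_train_test(6_000_000, 700_000, 100_000)
assert np.intersect1d(train_idx, test_idx).size == 0  # non-overlapping
```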
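And for the setup row, a sketch of initializing the m = 50 inducing points as K-means cluster centers of a random training subset, using scikit-learn's KMeans as a stand-in (the paper does not name an implementation, and the subset size here is an assumption).

```python
import numpy as np
from sklearn.cluster import KMeans

def init_inducing_points(X, m=50, subset_size=100_000, seed=0):
    """Initialize inducing points as K-means centers of a random subset
    of the training inputs. m = 50 follows the setup above; subset_size
    is an assumption."""
    rng = np.random.default_rng(seed)
    rows = rng.choice(len(X), size=min(subset_size, len(X)), replace=False)
    km = KMeans(n_clusters=m, n_init=10, random_state=seed).fit(X[rows])
    return km.cluster_centers_

# Example on synthetic inputs.
Z = init_inducing_points(np.random.default_rng(1).standard_normal((10_000, 8)))
print(Z.shape)  # (50, 8)
```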