Inter-domain Deep Gaussian Processes
Authors: Tim G. J. Rudner, Dino Sejdinovic, Yarin Gal
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We assess the performance of our method on a range of regression tasks and demonstrate that it outperforms inter-domain shallow GPs and conventional DGPs on challenging large-scale real-world datasets exhibiting both global structure as well as a high degree of non-stationarity. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Oxford, Oxford, United Kingdom 2Department of Statistics, University of Oxford, Oxford, United Kingdom. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | For source code and additional results, see https://bit.ly/inter-domain-dgps. |
| Open Datasets | Yes | We use a smoothed sub-band of a speech signal taken from the TIMIT database and previously used in Bui & Turner (2014). To quantitatively assess the predictive performance of inter-domain DGPs, we evaluate them on a range of real-world datasets, which exhibit global structure, usually in the form of a temporal component that induces a high autocorrelation. The experiments include medium-sized datasets ( parking , air , traffic ), two very large datasets with over two and five million datapoints each ( power and airline ), and a high-dimensional dataset with 27 input dimensions ( appliances ). To assess the predictive performance of inter-domain DGPs on extremely complex, non-stationary data, we test our method on the U.S. flight delay prediction problem, a large-scale regression problem that has reached the status of a standard test in GP regression due to its massive size of 5,929,413 observations and its non-stationary nature, which makes it challenging for GPs with stationary covariance functions (Hensman et al., 2018). |
| Dataset Splits | No | The paper mentions a test set (40%) and a training set, but does not explicitly provide details about a separate validation set split or its proportion. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers. |
| Experiment Setup | Yes | To avoid undesirable edge effects in the DGP posterior predictive distributions, we normalize all input data dimensions to lie in the interval [0, 1] and define the RKHS over the interval [a, b] = [−2, 3]. We repeat this normalization at each DGP layer before feeding the samples into the next GP. To avoid pathologies in DGP models investigated in prior work (Duvenaud et al., 2014), we follow Salimbeni & Deisenroth (2017) and use a linear mean function m^(ℓ)(f^(ℓ−1)) = f^(ℓ−1) w^(ℓ), where w^(ℓ) is a vector of weights, for all but the final-layer GP, for which we use a zero mean function. We used a Matérn 3/2 kernel for all experiments. All models were trained with 50 inducing points. The inter-domain DGP with DSVI has two layers and the conventional DGP with DSVI has four layers. |
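The two setup choices quoted above (per-layer min-max normalization to [0, 1] and a linear mean function on all but the final-layer GP) can be sketched in plain NumPy. This is a minimal illustration of those two operations, not the authors' implementation; the function names and array shapes are assumptions.

```python
import numpy as np


def normalize_unit_interval(x):
    """Min-max normalize each input dimension to [0, 1].

    Per the paper's setup, this is applied to the inputs and then
    repeated at each DGP layer before feeding samples into the next GP.
    """
    x_min = x.min(axis=0, keepdims=True)
    x_max = x.max(axis=0, keepdims=True)
    return (x - x_min) / (x_max - x_min)


def linear_mean(f_prev, w):
    """Linear mean function m^(l)(f^(l-1)) = f^(l-1) w^(l).

    Used for all but the final-layer GP, which uses a zero mean.
    """
    return f_prev @ w


# Illustrative shapes only: 100 points, 3 input dims, 2 hidden GP dims.
rng = np.random.default_rng(0)
X = normalize_unit_interval(rng.normal(size=(100, 3)))
w = rng.normal(size=(3, 2))
mean = linear_mean(X, w)  # shape (100, 2)
```

In a full DGP (e.g. one built with GPflow-style SVGP layers), `normalize_unit_interval` would be applied between layers and `linear_mean` passed as each hidden layer's mean function; the kernel and 50 inducing points would come from the GP library itself.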