Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances
Authors: Csaba Toth, Harald Oberhauser
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then combine the resulting GP with LSTMs and GRUs to build larger models that leverage the strengths of each of these approaches and benchmark the resulting GPs on multivariate time series (TS) classification datasets. |
| Researcher Affiliation | Academia | Csaba Toth 1 Harald Oberhauser 1 1Mathematical Institute, University of Oxford, Oxford, United Kingdom. Correspondence to: Csaba Toth <csaba.toth@maths.ox.ac.uk>, Harald Oberhauser <harald.oberhauser@maths.ox.ac.uk>. |
| Pseudocode | Yes | Algorithm 1: Computing the inducing covariances K_ZZ; Algorithm 2: Computing the cross-covariances K_ZX (a simplified signature-kernel sketch follows this table) |
| Open Source Code | Yes | Code and benchmarks are publicly available at http://github.com/tgcsaba/GPSig. |
| Open Datasets | Yes | We benchmarked these GP models on 16 multivariate TS classification datasets, a collection introduced in (Baydogan, 2015) that has become a semistandard archive in TS classification... For this experiment, we took the AUSLAN dataset (Dua & Graff, 2017), which consists of n_C = 95 classes for n_X = 1140 training examples. |
| Dataset Splits | Yes | The RNN-architectures were selected independently for all models by grid-search among 6 variants, that is, the number of hidden units from [8, 32, 128] and with or without dropout. For training, early stopping was used with n = 500 epochs patience; a learning rate of α = 1 × 10⁻³; a minibatch size of 50; as optimizer Adam (Kingma & Ba, 2014) and Nadam (Dozat, 2015) were employed. |
| Hardware Specification | Yes | All experiments were run on a single NVIDIA GeForce GTX 1080 GPU. |
| Software Dependencies | Yes | We implemented our models using Python 3.7.3 and GPflow 1.5.0. |
| Experiment Setup | Yes | We used n_Z = 500 for all models; further all use a static kernel in one form or another, which we fixed to be the RBF kernel. The signature kernel was truncated at M = 4, and for GP-Sig p = 1 lags were used; the GP-Sig-RNNs did not use lags... The window size in GP-KConv-1D was set to w = 10. The RNN-architectures were selected independently for all models by grid-search among 6 variants, that is, the number of hidden units from [8, 32, 128] and with or without dropout. For training, early stopping was used with n = 500 epochs patience; a learning rate of α = 1 × 10⁻³; a minibatch size of 50; as optimizer Adam (Kingma & Ba, 2014) and Nadam (Dozat, 2015) were employed. (The sketches below illustrate this configuration.) |
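To make the "signature kernel truncated at M = 4" concrete, the following minimal NumPy sketch computes truncated path signatures and their level-wise inner product. This is not the paper's Algorithms 1-2 (which lift observations through a static RBF kernel and exploit low-rank structure to compute K_ZZ and K_ZX efficiently); it only illustrates the plain, untransformed case. The function names `truncated_signature` and `signature_kernel` and the toy paths are ours, not the authors'.

```python
# Minimal sketch (assumption: piecewise-linear paths, linear static kernel).
# Signature levels are accumulated via Chen's identity: the signature of a
# concatenated path is the tensor product of the segment signatures, and the
# signature of a single linear segment is the tensor exponential of its increment.
import numpy as np

def truncated_signature(path, M=4):
    """Signature levels 0..M of a piecewise-linear path (shape: length x dim)."""
    d = path.shape[1]
    increments = np.diff(path, axis=0)
    # Start from the unit of the truncated tensor algebra: (1, 0, ..., 0).
    sig = [np.ones(())] + [np.zeros((d,) * m) for m in range(1, M + 1)]
    for dx in increments:
        # Tensor exponential of the increment: dx^{(x) m} / m! at each level m.
        exp_dx, cur = [np.ones(())], np.ones(())
        for m in range(1, M + 1):
            cur = np.multiply.outer(cur, dx) / m
            exp_dx.append(cur)
        # Chen's identity: multiply the running signature by exp(dx), level by level.
        sig = [sum(np.multiply.outer(sig[k], exp_dx[m - k]) for k in range(m + 1))
               for m in range(M + 1)]
    return sig

def signature_kernel(path_x, path_y, M=4):
    """Level-wise inner product of truncated signatures (linear static kernel)."""
    sx, sy = truncated_signature(path_x, M), truncated_signature(path_y, M)
    return float(sum(np.sum(a * b) for a, b in zip(sx, sy)))

# Toy usage: two short 2-dimensional time series.
x = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.3]])
y = np.array([[0.0, 0.1], [0.5, 1.0], [1.5, 1.2]])
print(signature_kernel(x, y, M=4))
```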
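The training configuration quoted in the Experiment Setup row (500 inducing points, RBF static kernel, Adam, learning rate 1 × 10⁻³, minibatch size 50) can be sketched as a standard sparse variational GP classifier. The paper used GPflow 1.5.0 with its signature covariances; the sketch below instead uses the current GPflow 2 API and a plain RBF kernel on flattened toy sequences as a stand-in, so it illustrates only the optimization setup, not the authors' model.

```python
# Rough sketch of the quoted training setup (assumptions: GPflow 2 API,
# random toy data, plain RBF kernel instead of the signature kernel,
# fixed step count instead of the paper's early stopping with 500-epoch patience).
import numpy as np
import tensorflow as tf
import gpflow

num_classes, num_inducing, minibatch = 10, 500, 50
X = np.random.randn(1140, 60).astype(np.float64)   # flattened toy sequences
Y = np.random.randint(num_classes, size=(1140, 1)).astype(np.float64)

kernel = gpflow.kernels.SquaredExponential()        # the "static" RBF kernel
likelihood = gpflow.likelihoods.MultiClass(num_classes)
Z = X[np.random.choice(len(X), num_inducing, replace=False)].copy()

model = gpflow.models.SVGP(
    kernel, likelihood,
    inducing_variable=gpflow.inducing_variables.InducingPoints(Z),
    num_latent_gps=num_classes, num_data=len(X),
)

# Minibatched variational objective (ELBO) and Adam with the quoted settings.
dataset = tf.data.Dataset.from_tensor_slices((X, Y)).repeat().shuffle(len(X)).batch(minibatch)
loss = model.training_loss_closure(iter(dataset))
optimizer = tf.optimizers.Adam(learning_rate=1e-3)

for step in range(2000):
    optimizer.minimize(loss, model.trainable_variables)
```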