Memory-Based Dual Gaussian Processes for Sequential Learning
Authors: Paul Edmund Chang, Prakhar Verma, S. T. John, Arno Solin, Mohammad Emtiyaz Khan
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate its effectiveness in several applications involving Bayesian optimization, active learning, and continual learning. We perform a range of experiments to show the capability of the proposed method on various sequential learning problems. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Aalto University, Finland; ²Finnish Center for Artificial Intelligence (FCAI); ³RIKEN Center for AI Project, Tokyo, Japan. |
| Pseudocode | Yes | Algorithm 1 Dual-SVGP with memory |
| Open Source Code | Yes | A reference implementation of the methods presented in this paper is available at: https://github.com/AaltoML/sequential-gp. |
| Open Datasets | Yes | We consider a setup where the banana data set and UCI (Dua & Graff, 2017) data sets are converted into streaming setups. For split MNIST (see Sec. 4.3), we use the standard MNIST data provided by TensorFlow. For the robot experiment for learning magnetic field anomalies, we use the data from Solin et al. (2018) that is available at https://github.com/AaltoML/magnetic-data. |
| Dataset Splits | Yes | For split MNIST (see Sec. 4.3), we use the standard MNIST data provided by TensorFlow. We concatenate the standard train and test set provided and split it 80:20 for training and testing. Table 1. UCI data sets: negative log predictive density, mean (standard deviation) over 10-fold cross-validation, lower is better. For converting the data sets into a streaming setting, we sort the data on the first dimension and split the data set into 50 subsets for all data sets apart from Mammographic (20 subsets). (A sketch of this streaming conversion follows the table.) |
| Hardware Specification | No | The paper mentions 'On the same GPU' and 'computational resources provided by the Aalto Science-IT project and CSC IT Center for Science, Finland'. However, it does not specify any particular GPU models, CPU models, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and the 'Adam optimizer (Kingma & Ba, 2015)' but does not provide version numbers for these or for any other software libraries, environments, or tools used, which are necessary for full reproducibility. |
| Experiment Setup | Yes | For hyperparameter learning in our proposed model, we use the Adam optimizer (Kingma & Ba, 2015) with learning rate 10⁻² for 100 iterations for each set of data. We use 10 latent GPs which matches the number of classes, with 300 inducing variables, and use a softmax likelihood. The number of memory points for each set of tasks is set to 400. For optimization of the hyperparameters, we use the Adam optimizer with learning rate 10⁻² for 20,000 iterations. (A sketch of this configuration follows the table.) |
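
The Dataset Splits row quotes two recipes: an 80:20 hold-out (quoted for split MNIST) and a sort-then-chunk conversion into 50 subsets (quoted for the UCI streaming setting). The sketch below combines them for illustration only; the function name `make_streaming_splits` and its signature are hypothetical, assuming NumPy.

```python
import numpy as np

def make_streaming_splits(X, y, n_subsets=50, test_frac=0.2, seed=0):
    """Turn a static dataset into an ordered stream of tasks.

    Hypothetical helper: hold out a test split, sort the remaining
    data on the first input dimension, and cut it into contiguous
    subsets that arrive one at a time.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(test_frac * len(X))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_tr, y_tr = X[train_idx], y[train_idx]

    # Sorting on the first input dimension makes each subset cover a
    # different region of the input space, so the stream is non-i.i.d.
    order = np.argsort(X_tr[:, 0])
    X_tr, y_tr = X_tr[order], y_tr[order]

    # Contiguous chunks, presented to the learner sequentially.
    tasks = list(zip(np.array_split(X_tr, n_subsets),
                     np.array_split(y_tr, n_subsets)))
    return tasks, (X[test_idx], y[test_idx])
```

The contiguous, sorted chunks are what make the setup a genuine streaming problem rather than random mini-batching: each task covers a different input region.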
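
The Experiment Setup row gives concrete values (Adam at learning rate 10⁻², 10 latent GPs, 300 inducing variables, softmax likelihood). Below is a minimal sketch of that configuration, assuming GPflow on top of the TensorFlow stack the paper mentions; GPflow itself is not named in the paper, and this is a plain SVGP, not the paper's dual/memory update.

```python
import numpy as np
import tensorflow as tf
import gpflow

# Hypothetical stand-in data: X is [N, D] inputs, Y is [N, 1] integer
# class labels in {0, ..., 9}; the actual experiment uses split MNIST.
N, D, num_classes, num_inducing = 1000, 784, 10, 300
rng = np.random.default_rng(0)
X = rng.standard_normal((N, D))
Y = rng.integers(num_classes, size=(N, 1))

# 300 inducing inputs initialized from the data, 10 latent GPs (one per
# class), and a softmax likelihood, as quoted from the paper.
Z = X[rng.choice(N, num_inducing, replace=False)].copy()
model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Softmax(num_classes),
    inducing_variable=Z,
    num_latent_gps=num_classes,
)

# Adam at learning rate 1e-2, 100 iterations per incoming set of data,
# matching the quoted hyperparameter-learning setup.
optimizer = tf.optimizers.Adam(learning_rate=1e-2)
training_loss = model.training_loss_closure((X, Y))
for _ in range(100):
    optimizer.minimize(training_loss, model.trainable_variables)
```

The 400 memory points and the dual-parameter updates of Algorithm 1 are the paper's contribution and are not reproduced here; only the quoted optimizer settings and model sizes are shown.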