Memory-Based Dual Gaussian Processes for Sequential Learning

Authors: Paul Edmund Chang, Prakhar Verma, S. T. John, Arno Solin, Mohammad Emtiyaz Khan

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate its effectiveness in several applications involving Bayesian optimization, active learning, and continual learning." "We perform a range of experiments to show the capability of the proposed method on various sequential learning problems."
Researcher Affiliation | Academia | "¹Department of Computer Science, Aalto University, Finland; ²Finnish Center for Artificial Intelligence (FCAI); ³RIKEN Center for AI Project, Tokyo, Japan."
Pseudocode | Yes | "Algorithm 1: Dual-SVGP with memory"
Open Source Code | Yes | "A reference implementation of the methods presented in this paper is available at: https://github.com/AaltoML/sequential-gp"
Open Datasets | Yes | "We consider a setup where the banana data set and UCI (Dua & Graff, 2017) data sets are converted into streaming setups." "For split MNIST (see Sec. 4.3), we use the standard MNIST data provided by TensorFlow." "For the robot experiment for learning magnetic field anomalies, we use the data from Solin et al. (2018) that is available at https://github.com/AaltoML/magnetic-data."
Dataset Splits | Yes | "For split MNIST (see Sec. 4.3), we use the standard MNIST data provided by TensorFlow. We concatenate the standard train and test set provided and split it 80:20 for training and testing." "Table 1. UCI data sets: negative log predictive density, mean (standard deviation) over 10-fold cross-validation, lower is better." "For converting the data sets into a streaming setting, we sort the data on the first dimension and split the data set into 50 subsets for all data sets apart from Mammographic (20 subsets)."
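The streaming conversion quoted above is straightforward to reproduce. Below is a minimal sketch of that protocol (sort the data on the first input dimension, then split it into ordered subsets); the function name and the NumPy implementation are our own illustration, not the authors' code, which is in the linked repository.

```python
import numpy as np

def to_streaming_batches(X, y, n_subsets=50):
    """Illustrative sketch of the quoted streaming setup: sort the data
    on the first input dimension, then split it into ordered subsets
    that arrive one at a time. Not the authors' implementation."""
    order = np.argsort(X[:, 0])          # sort on the first input dimension
    X_sorted, y_sorted = X[order], y[order]
    # array_split tolerates sizes that are not divisible by n_subsets
    return list(zip(np.array_split(X_sorted, n_subsets),
                    np.array_split(y_sorted, n_subsets)))

# Per the quoted setup: 50 subsets for most UCI data sets,
# 20 for Mammographic, e.g.
# batches = to_streaming_batches(X, y, n_subsets=20)
```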
Hardware Specification | No | The paper mentions 'On the same GPU' and 'computational resources provided by the Aalto Science-IT project and CSC IT Center for Science, Finland'. However, it does not specify the GPU or CPU models or any other hardware details used for the experiments.
Software Dependencies | No | The paper mentions 'TensorFlow' and 'the Adam optimizer (Kingma & Ba, 2015)' but does not provide version numbers for these or any other software libraries, environments, or tools, which are necessary for full reproducibility.
Experiment Setup | Yes | "For hyperparameter learning in our proposed model, we use the Adam optimizer (Kingma & Ba, 2015) with learning rate 10⁻² for 100 iterations for each set of data." "We use 10 latent GPs which matches the number of classes, with 300 inducing variables, and use a softmax likelihood. The number of memory points for each set of tasks is set to 400." "For optimization of the hyperparameters, we use the Adam optimizer with learning rate 10⁻² for 20,000 iterations."
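For concreteness, here is a minimal sketch of the quoted split-MNIST configuration (10 latent GPs, 300 inducing variables, softmax likelihood, Adam with learning rate 10⁻²), using a plain GPflow SVGP as a stand-in. The authors' actual model is the Dual-SVGP with memory (Algorithm 1) from their repository; the kernel choice, the placeholder data, and everything else beyond the quoted numbers are assumptions.

```python
# Sketch of the quoted training configuration, assuming a GPflow-style
# SVGP as a stand-in for the authors' Dual-SVGP with memory. The 400
# memory points are specific to the authors' method and not modeled here.
import numpy as np
import tensorflow as tf
import gpflow

num_classes, num_inducing = 10, 300          # quoted: 10 latent GPs, 300 inducing variables
X = np.random.randn(1000, 784)               # placeholder for one split-MNIST task
y = np.random.randint(num_classes, size=(1000, 1))

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.RBF(),             # kernel choice is an assumption
    likelihood=gpflow.likelihoods.Softmax(num_classes),  # quoted: softmax likelihood
    inducing_variable=X[:num_inducing].copy(),
    num_latent_gps=num_classes,              # one latent GP per class
)

opt = tf.keras.optimizers.Adam(learning_rate=1e-2)   # quoted: Adam, lr 10^-2
loss = model.training_loss_closure((X, y))
for _ in range(100):                         # quoted: 100 iterations per set of data
    opt.minimize(loss, model.trainable_variables)
```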