Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learn the Time to Learn: Replay Scheduling in Continual Learning
Authors: Marcus Klasson, Hedvig Kjellström, Cheng Zhang
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present the experimental results to show the importance of replay scheduling in CL. First, we demonstrate the benefits of replay scheduling by using MCTS for finding replay schedules in Section 4.1. Thereafter, we evaluate our RL-based framework using DQN (Mnih et al., 2013) and A2C (Mnih et al., 2016) for learning policies that generalize to new CL scenarios in Section 4.2. Full details on experimental settings and additional results are in Appendix C and D. We conduct experiments on several CL benchmark datasets: Split MNIST (LeCun et al., 1998; Zenke et al., 2017), Split FashionMNIST (Xiao et al., 2017), Split notMNIST (Bulatov, 2011), Permuted MNIST (Goodfellow et al., 2013), Split CIFAR-100 (Krizhevsky & Hinton, 2009), and Split miniImagenet (Vinyals et al., 2016). |
| Researcher Affiliation | Collaboration | Marcus Klasson (Aalto University), Hedvig Kjellström (KTH Royal Institute of Technology), Cheng Zhang (Microsoft Research) |
| Pseudocode | Yes | A Additional Methodology: In this section, we provide pseudo-code for MCTS to search for replay schedules in single CL environments (Section A.1), as well as pseudo-code for the RL-based framework for learning the replay scheduling policies (Section A.2). Algorithm 1: Monte Carlo Tree Search for Replay Scheduling; Algorithm 2: RL Framework for Learning Replay Scheduling Policy |
| Open Source Code | Yes | Code is publicly available under the MIT license: https://github.com/marcusklasson/replay_scheduling |
| Open Datasets | Yes | We conduct experiments on several CL benchmark datasets: Split MNIST (LeCun et al., 1998; Zenke et al., 2017), Split FashionMNIST (Xiao et al., 2017), Split notMNIST (Bulatov, 2011), Permuted MNIST (Goodfellow et al., 2013), Split CIFAR-100 (Krizhevsky & Hinton, 2009), and Split miniImagenet (Vinyals et al., 2016). |
| Dataset Splits | Yes | We let the network $f_\phi$, parameterized by $\phi$, learn $T$ tasks sequentially from the datasets $D_1, \dots, D_T$ arriving one at a time. The $t$-th dataset $D_t = \{(x_t^{(i)}, y_t^{(i)})\}_{i=1}^{N_t}$ consists of $N_t$ samples, where $x_t^{(i)}$ and $y_t^{(i)}$ are the $i$-th data point and class label, respectively. Furthermore, each dataset is split into a training, validation, and test set, i.e., $D_t = \{D_t^{(\text{train})}, D_t^{(\text{val})}, D_t^{(\text{test})}\}$. MCTS and Heur-GD randomly sample 15% of the training data of each task to use for validation. |
| Hardware Specification | Yes | All experiments were performed on one NVIDIA GeForce RTX 2080 Ti on an internal GPU cluster. |
| Software Dependencies | No | The code for DQN was adapted from OpenAI Baselines (Dhariwal et al., 2017) and the PyTorch (Paszke et al., 2019) tutorial on DQN: https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html. For A2C, we followed the implementations released by Kostrikov (2018) and Igl et al. (2021). Explanation: The paper mentions software libraries like PyTorch and refers to other implementations, but does not provide specific version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | CL Hyperparameters. We train all networks with the Adam optimizer (Kingma & Ba, 2015) with learning rate $\eta = 0.001$ and hyperparameters $\beta_1 = 0.9$ and $\beta_2 = 0.999$. Note that the learning rate for Adam is not reset before training on a new task. Next, we give the number of training epochs and batch sizes specific to each dataset: Split MNIST: 10 epochs/task, batch size 128. Split FashionMNIST: 30 epochs/task, batch size 128. Split notMNIST: 50 epochs/task, batch size 128. Permuted MNIST: 20 epochs/task, batch size 128. Split CIFAR-100: 25 epochs/task, batch size 256. Split miniImagenet: 1 epoch/task (task 1 trained for 5 epochs as warm-up), batch size 32. |
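The validation protocol (Dataset Splits row) and optimizer handling (Experiment Setup row) quoted above can be sketched together in PyTorch. This is a minimal illustration under stated assumptions, not the authors' released code: the dataset keys, `make_task_splits`, and `train_sequentially` are hypothetical names, and the loop omits the replay-scheduling mechanism itself. The key reproducibility details it encodes are the 15% random validation hold-out per task and the single Adam optimizer whose state is not reset between tasks.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, random_split

# Per-dataset (epochs/task, batch size) transcribed from the Experiment Setup row.
# Split miniImagenet additionally trains task 1 for 5 warm-up epochs (not modeled here).
EPOCHS_BATCH = {
    "split_mnist": (10, 128),
    "split_fashion_mnist": (30, 128),
    "split_notmnist": (50, 128),
    "permuted_mnist": (20, 128),
    "split_cifar100": (25, 256),
    "split_miniimagenet": (1, 32),
}

def make_task_splits(train_data, val_fraction=0.15, seed=0):
    """Randomly hold out 15% of a task's training data for validation,
    as reported for MCTS and Heur-GD."""
    n_val = int(len(train_data) * val_fraction)
    gen = torch.Generator().manual_seed(seed)
    return random_split(train_data, [len(train_data) - n_val, n_val], generator=gen)

def train_sequentially(model, task_datasets, dataset_name):
    epochs, batch_size = EPOCHS_BATCH[dataset_name]
    # One Adam optimizer is created up front; its internal state (and hence the
    # effective learning rate) is NOT reset when a new task arrives.
    opt = optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
    loss_fn = nn.CrossEntropyLoss()
    for task_data in task_datasets:
        train_set, _val_set = make_task_splits(task_data)
        loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
    return model
```

Keeping the optimizer alive across tasks matters for reproducibility: re-instantiating Adam per task would reset its moment estimates and change the effective step sizes early in each task.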