Dark Experience for General Continual Learning: a Strong, Simple Baseline
Authors: Pietro Buzzega, Matteo Boschini, Angelo Porrello, Davide Abati, Simone Calderara
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By conducting an extensive analysis on both standard benchmarks and a novel GCL evaluation setting (MNIST-360), we show that such a seemingly simple baseline outperforms consolidated approaches and leverages limited resources. |
| Researcher Affiliation | Academia | AImage Lab, University of Modena and Reggio Emilia, Modena, Italy (name.surname@unimore.it) |
| Pseudocode | Yes | Algorithm 1 Dark Experience Replay. Input: dataset D, parameters θ, scalar α, learning rate λ. M ← {}; for (x, y) in D do: (x′, z′, y′) ← sample(M); x_t ← augment(x); x′_t ← augment(x′); z ← h_θ(x_t); reg ← α ‖z′ − h_θ(x′_t)‖²₂; θ ← θ + λ ∇_θ[ℓ(y, f_θ(x_t)) + reg]; M ← reservoir(M, (x, z)); end for. (See the illustrative sketch after the table.) |
| Open Source Code | Yes | Code is available at https://github.com/aimagelab/mammoth. |
| Open Datasets | Yes | In practice, we follow [10, 42] by splitting CIFAR-10 [21] and Tiny Image Net [38] in 5 and 10 tasks, each of which introduces 2 and 20 classes respectively. We show all the classes in the same fixed order across different runs. For this setting, we leverage two common protocols built upon the MNIST dataset [23], namely Permuted MNIST [20] and Rotated MNIST [27]. |
| Dataset Splits | Yes | We select hyperparameters by performing a grid-search on a validation set, the latter obtained by sampling 10% of the training set. |
| Hardware Specification | Yes | We conduct all tests under the same conditions, running each benchmark on a Desktop Computer equipped with an NVIDIA Titan X GPU and an Intel i7-6850K CPU. |
| Software Dependencies | No | The paper mentions using a 'Stochastic Gradient Descent (SGD) optimizer' and 'ResNet18' but does not specify programming language versions or library versions with numbers, such as Python 3.x or PyTorch 1.x. |
| Experiment Setup | Yes | For MNIST-based settings, one epoch per task is sufficient. Conversely, we increase the number of epochs to 50 for Sequential CIFAR-10 and 100 for Sequential Tiny Image Net respectively... We select hyperparameters by performing a grid-search on a validation set... setting them both to 0.5 yields stable performance. |
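The Algorithm 1 transcript in the table maps onto a short replay loop: each step combines a cross-entropy loss on the current batch with an α-weighted MSE penalty tying the network's current logits on buffered examples to the logits stored when those examples were first seen, and the buffer is maintained by reservoir sampling. Below is a minimal, non-authoritative PyTorch sketch of such a step; the names `DERBuffer` and `der_step`, the augmentation hook, and the device handling are assumptions made for illustration and do not come from the authors' mammoth repository. (DER++, for which the paper reports α = β = 0.5, would add a second cross-entropy term on a further buffered batch with its stored labels; that term is omitted here.)

```python
# Minimal sketch of a Dark Experience Replay (DER) update, assuming a PyTorch
# classifier whose forward pass returns pre-softmax logits h_theta(x).
# `DERBuffer` and `der_step` are illustrative names, not taken from mammoth.
import random
import torch
import torch.nn.functional as F


class DERBuffer:
    """Fixed-capacity memory filled via reservoir sampling; stores (input, logits) pairs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []       # list of (x, z) tuples
        self.num_seen = 0    # examples offered to the buffer so far

    def add(self, x, z):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, z))
        else:
            # Reservoir sampling keeps every seen example with equal probability.
            idx = random.randrange(self.num_seen)
            if idx < self.capacity:
                self.data[idx] = (x, z)

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, zs = zip(*batch)
        return torch.stack(xs), torch.stack(zs)


def der_step(model, optimizer, x, y, buffer, alpha=0.5, augment=lambda t: t):
    """One DER step: cross-entropy on the current batch plus an alpha-weighted MSE
    term matching the network's logits on buffered inputs to the stored logits."""
    optimizer.zero_grad()
    logits = model(augment(x))
    loss = F.cross_entropy(logits, y)
    if buffer.data:
        x_buf, z_buf = buffer.sample(x.size(0))
        x_buf, z_buf = x_buf.to(x.device), z_buf.to(x.device)
        loss = loss + alpha * F.mse_loss(model(augment(x_buf)), z_buf)
    loss.backward()
    optimizer.step()
    # Insert current examples with their detached logits for later replay.
    for xi, zi in zip(x, logits.detach()):
        buffer.add(xi.cpu(), zi.cpu())
    return loss.item()
```

A training run would simply call `der_step` once per incoming batch, streaming over tasks in order; since neither the loss nor the buffer update relies on task boundaries, the same loop applies to the General Continual Learning setting evaluated with MNIST-360.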