Model Zoo: A Growing Brain That Learns Continually

Authors: Rahul Ramesh, Pratik Chaudhari

ICLR 2022

Reproducibility assessment: each variable below lists the result together with the supporting LLM response (evidence quoted from the paper).

Research Type: Experimental
"We use statistical learning theory and experimental analysis to show how multiple tasks can interact with each other in a non-trivial fashion when a single model is trained on them. We demonstrate that Model Zoo obtains large gains in accuracy on a wide variety of continual learning benchmark problems. We comprehensively evaluate Model Zoo on existing task-incremental continual learning benchmark problems and show comparisons with existing methods."

Researcher Affiliation: Academia
"Rahul Ramesh & Pratik Chaudhari, University of Pennsylvania, {rahulram,pratikac}@seas.upenn.edu"

Pseudocode: No
The paper describes the Model Zoo algorithm in prose and equations (e.g., Equation 8) but does not provide a structured pseudocode or algorithm block.

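In lieu of an algorithm block, the following plain-Python sketch reconstructs the training loop from the paper's prose description: each episode adds one small multi-task learner trained on the new task plus the previously seen tasks with the highest current training loss, and predictions average all learners trained on a given task. All names here (train_multitask_learner, compute_train_loss, etc.) are hypothetical placeholders, not the authors' code.

```python
def train_model_zoo(tasks, compute_train_loss, train_multitask_learner, b=5):
    """Grow a zoo of small multi-task models, one per episode (boosting-style sketch)."""
    zoo = []    # list of (model, set of tasks it was trained on)
    seen = []   # tasks observed so far
    for new_task in tasks:                      # one new task arrives per episode
        seen.append(new_task)
        # Rank previously seen tasks by the zoo's current training loss (hardest first).
        losses = {t: compute_train_loss(zoo, t) for t in seen}
        num_picked = min(len(seen), b)
        chosen = sorted(seen, key=lambda t: losses[t], reverse=True)[:num_picked]
        if new_task not in chosen:              # the newest task is always trained on
            chosen[-1] = new_task
        model = train_multitask_learner(chosen)  # small multi-head network (placeholder)
        zoo.append((model, set(chosen)))
    return zoo

def zoo_predict(zoo, task, x):
    # Average the outputs of every learner in the zoo that was trained on this task.
    outputs = [model(task, x) for model, trained_on in zoo if task in trained_on]
    return sum(outputs) / len(outputs)
```
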
Open Source Code: Yes
"To ensure the reproducibility of our work, the full source code is available at https://github.com/rahul13ramesh/modelzoo_continual."

Open Datasets: Yes
"We evaluate on Rotated-MNIST (Lopez-Paz and Ranzato, 2017), Split-MNIST (Zenke et al., 2017), Permuted-MNIST (Kirkpatrick et al., 2017), Split-CIFAR10 (Zenke et al., 2017), Split-CIFAR100 (Zenke et al., 2017), Coarse-CIFAR100 (Rosenbaum et al., 2017; Yoon et al., 2019; Shanahan et al., 2021) and Split-miniImagenet (Vinyals et al., 2016; Chaudhry et al., 2019b)."

Dataset Splits: Yes
"Split-miniImagenet ... 20% of the samples are used as the validation set." "We compare algorithms in terms of the validation accuracy averaged across all tasks at the end of all episodes..."

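A minimal sketch of this protocol as we read it: each task holds out 20% of its samples for validation, and methods are compared by the mean per-task validation accuracy after the final episode. The task_accuracy helper is a hypothetical placeholder.

```python
import random

def split_train_val(samples, val_frac=0.2, seed=0):
    """Hold out a fraction of a task's samples as its validation set."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_frac)
    return samples[n_val:], samples[:n_val]          # (train, validation)

def mean_final_validation_accuracy(zoo, val_sets, task_accuracy):
    """Average per-task validation accuracy at the end of all episodes."""
    accs = [task_accuracy(zoo, task, data) for task, data in val_sets.items()]
    return sum(accs) / len(accs)
```
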
Hardware Specification: Yes
"All entries for inference time in Table 2 were computed by us on an Nvidia V100 GPU and therefore they can be compared directly with each other."

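The paper states only the GPU used; a common way such per-batch inference times are measured in PyTorch is with CUDA events, as in this illustrative snippet (not the authors' measurement code).

```python
import torch

@torch.no_grad()
def time_inference_ms(model, batch, warmup=10, iters=100):
    """Average milliseconds per forward pass on the current CUDA device."""
    model.eval().cuda()
    batch = batch.cuda()
    for _ in range(warmup):          # warm-up iterations exclude one-time CUDA costs
        model(batch)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        model(batch)
    end.record()
    torch.cuda.synchronize()         # wait until all queued kernels have finished
    return start.elapsed_time(end) / iters
```
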
Software Dependencies: No
"Ray Tune (Liaw et al., 2018) was used for hyper-parameter tuning" and "The Async Successive Halving Algorithm (ASHA) scheduler (Li et al., 2018) was used to prune hyper-parameter choices with the search space determined by Nevergrad (Rapin and Teytaud, 2018)." The paper names the software tools it used but does not give version numbers for them (e.g., "Ray Tune X.Y.Z").

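For readers reconstructing the environment, the tools named above are typically wired together as below; module paths follow Ray 1.x, and the trainable is a dummy placeholder rather than the authors' training code.

```python
import nevergrad as ng
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.suggest.nevergrad import NevergradSearch

def train_fn(config):
    # Placeholder objective: the real setup would train the network with the
    # sampled hyper-parameters and report its validation accuracy.
    val_acc = 1.0 - abs(config["lr"] - 0.01)
    tune.report(val_acc=val_acc)

search_space = {
    "lr": tune.loguniform(1e-4, 1e-1),
    "weight_decay": tune.loguniform(1e-6, 1e-3),
}

analysis = tune.run(
    train_fn,
    config=search_space,
    num_samples=50,
    metric="val_acc",
    mode="max",
    scheduler=ASHAScheduler(),                                   # prunes weak trials early
    search_alg=NevergradSearch(optimizer=ng.optimizers.OnePlusOne),
)
print(analysis.best_config)
```
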
Experiment Setup: Yes
"The final values of training hyper-parameters that were chosen are, learning-rate of 0.01, mini-batch size of 16, dropout probability of 0.2 and weight-decay of 10^-5." Model Zoo uses b = min(k, 5) tasks at each round of continual learning, where k is the number of tasks seen so far; for benchmarks with only 5 tasks (the MNIST variants), b = 2 is used.

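For convenience, the quoted values can be collected into a single configuration sketch; the key names and the exact handling of b below are our assumptions, not the released code.

```python
# Final training hyper-parameters reported in the paper, gathered in one place.
config = {
    "learning_rate": 0.01,
    "minibatch_size": 16,
    "dropout": 0.2,
    "weight_decay": 1e-5,
}

def tasks_per_round(num_tasks_seen: int, benchmark_num_tasks: int) -> int:
    # b = min(k, 5) in general; benchmarks with only 5 tasks (MNIST variants) use b = 2.
    cap = 2 if benchmark_num_tasks == 5 else 5
    return min(num_tasks_seen, cap)
```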