Random Path Selection for Continual Learning

Authors: Jathushan Rajasegaran, Munawar Hayat, Salman H. Khan, Fahad Shahbaz Khan, Ling Shao

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we demonstrate that the proposed method surpasses the state-of-the-art performance on incremental learning and by utilizing parallel computation this method can run in constant time with nearly the same efficiency as a conventional deep convolutional neural network. Also, Section 4 is titled "Experiments and Results" and contains detailed experimental evaluation, including comparisons, ablation studies, and performance metrics.
Researcher Affiliation | Industry | Inception Institute of Artificial Intelligence (first.last@inceptioniai.org)
Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Codes available at https://github.com/brjathu/RPSnet
Open Datasets | Yes | For our experiments, we use evaluation protocols similar to iCARL [21]. We incrementally learn 100 classes on CIFAR-100 in groups of 10, 20 and 50 at a time. For ImageNet, we use the same subset as [21] comprising of 100 classes and incrementally learn them in groups of 10. ... We also experiment our model with MNIST and SVHN datasets. (A sketch of this class-incremental split protocol is given after the table.)
Dataset Splits | No | The paper mentions using 'evaluation protocols similar to iCARL [21]' and evaluating on 'test samples of all seen classes', but it does not explicitly state specific percentages or counts for a distinct validation dataset split. The information provided is insufficient to confirm a validation split.
Hardware Specification | Yes | For each task, we train N = 8 models in parallel using a NVIDIA-DGX-1 machine. (A parallel-training sketch is given after the table.)
Software Dependencies | No | The paper mentions using 'Adam [14]' for optimization but does not provide specific version numbers for any software libraries or dependencies, only the general optimizer name.
Experiment Setup | Yes | For each task, we train our model for 100 epochs using Adam [14] with te = 2, with learning rate starting from 10⁻³ and divided by 2 after every 20 epochs. We set the controller's scaling factor to γ = 2.5 and γ = 10 respectively for the CIFAR and ImageNet datasets. We fix M = 8 and J = 2 except for the 50 classes per task, where J = 1. We do not use any weight or network regularization scheme such as dropout in our model. For augmentation, training images are randomly cropped, flipped and rotated (< 10°). (A sketch of this schedule is given after the table.)
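The class-incremental protocol quoted in the Open Datasets row (100 classes learned in groups of 10, 20 or 50 at a time) can be illustrated with a short sketch. This is not the authors' code; the function name and the fixed-seed shuffle are assumptions, used only to show how class labels would be partitioned into tasks.

```python
# Hypothetical sketch: partition CIFAR-100's 100 class labels into
# incremental tasks (e.g. 10 tasks of 10 classes, as in the reported protocol).
import random

def make_incremental_tasks(num_classes=100, classes_per_task=10, seed=0):
    """Return a list of tasks, each a list of class labels introduced in that increment."""
    labels = list(range(num_classes))
    random.Random(seed).shuffle(labels)  # fixed class order for reproducibility (assumed)
    return [labels[i:i + classes_per_task]
            for i in range(0, num_classes, classes_per_task)]

tasks = make_incremental_tasks(classes_per_task=10)
print(len(tasks), tasks[0])  # 10 tasks; first task's 10 class labels
```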
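The hardware row states that N = 8 models (one per candidate path) are trained in parallel on a DGX-1. A minimal sketch of such a launch, assuming one worker process per GPU, is given below; `train_path` is a hypothetical stand-in for training a single random path and is not the authors' implementation.

```python
# Hypothetical sketch: launch N = 8 path-training workers in parallel,
# one per GPU of an 8-GPU machine.
import multiprocessing as mp

def train_path(path_id, gpu_id):
    # A real worker would build the model, pin it to the given GPU,
    # train one randomly selected path, and report its accuracy.
    print(f"path {path_id} training on GPU {gpu_id}")

if __name__ == "__main__":
    workers = [mp.Process(target=train_path, args=(n, n % 8)) for n in range(8)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```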
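The reported optimisation settings (Adam, learning rate starting at 10⁻³ and halved every 20 epochs over 100 epochs, with random crop/flip/small-rotation augmentation) map onto a standard PyTorch configuration. The sketch below is an assumed reconstruction, not the released training script; the crop padding and the stand-in model are illustrative placeholders.

```python
# Assumed PyTorch reconstruction of the reported schedule: Adam, lr = 1e-3,
# halved every 20 epochs, 100 epochs total; crop/flip/rotation (< 10 degrees).
import torch
import torch.nn as nn
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # padding value is an assumption
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),          # rotations within +/- 10 degrees
    transforms.ToTensor(),
])

model = nn.Linear(32 * 32 * 3, 100)         # stand-in for the actual RPSnet path model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

for epoch in range(100):
    # ... one epoch over the current task's (augmented) data would go here ...
    optimizer.step()    # placeholder parameter update
    scheduler.step()    # halves the learning rate every 20 epochs
```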