NPCL: Neural Processes for Uncertainty-Aware Continual Learning

Authors: Saurav Jha, Dong Gong, He Zhao, Lina Yao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that the NPCL outperforms previous CL approaches."
Researcher Affiliation | Collaboration | Saurav Jha (UNSW Sydney, saurav.jha@unsw.edu.au); Dong Gong (UNSW Sydney, dong.gong@unsw.edu.au); He Zhao (CSIRO's Data61, he.zhao@ieee.org); Lina Yao (CSIRO's Data61 and UNSW Sydney, lina.yao@data61.csiro.au)
Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/srvCodes/NPCL."
Open Datasets | Yes | "For class-IL, we use three public datasets: sequential CIFAR-10 (S-CIFAR-10) [35], sequential CIFAR-100 (S-CIFAR-100) [54], and sequential Tiny ImageNet (S-Tiny-ImageNet) [6]. For domain-IL, we use Permuted MNIST (P-MNIST) [30] and Rotated MNIST (R-MNIST) [35]."
Dataset Splits | Yes | "We arrive at the best hyperparameter settings for each of our datasets through grid search over a validation set made of 10% of the training set on each dataset." (see the split/grid-search sketch after the table)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as exact GPU/CPU models or processor types.
Software Dependencies | No | The paper mentions general software components such as the SGD optimizer, layer normalization, and ReLU, but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | "We train all models using SGD optimizer. The number of training epochs per task for S-Tiny-ImageNet is 100, for S-CIFAR-(10/100) is 50, and that for (P/R)-MNIST is 1. ... we fix the batch sizes for the new task's samples and for replay samples to 32 each for the class-IL datasets and to 128 each for the domain-IL datasets. ... NPCL training additionally relies on linearly increasing the learning rate (LR) over a period of 4000 iterations for class-IL and 40 iterations for domain-IL settings. We further apply gradient clipping [40] on the L2-norm of the NPCL parameters with a cap of 10000." and "Table 9: Hyperparameters for loss contributions that were tuned on validation sets for each dataset." (see the training-loop sketch after the table)
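
The "Dataset Splits" row describes holding out 10% of each training set as a validation set and grid-searching hyperparameters on it. The sketch below is a minimal illustration of that protocol, not the authors' code: the dataset choice (CIFAR-100), the candidate weight grid, and the val_accuracy() stub are all illustrative assumptions.

```python
# Minimal sketch of the 10% validation split and grid search described above.
# Dataset, grid values, and the evaluation stub are assumptions for illustration.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR100(root="./data", train=True, download=True,
                               transform=transforms.ToTensor())

val_size = int(0.1 * len(full_train))               # 10% held out for validation
train_set, val_set = random_split(
    full_train,
    [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),     # fixed seed so the split is repeatable
)

def val_accuracy(loss_weight: float) -> float:
    """Stub standing in for a full train-then-validate run with this loss weight."""
    return 0.0  # placeholder result

# Grid search over a hypothetical loss-contribution weight (cf. Table 9 in the paper).
candidate_weights = [0.01, 0.1, 1.0]
best_weight = max(candidate_weights, key=val_accuracy)
```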
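
The "Experiment Setup" row quotes three optimisation ingredients: SGD, a linear LR warm-up over 4000 iterations for class-IL (40 for domain-IL), and gradient clipping on the L2-norm with a cap of 10000. The PyTorch loop below is a minimal sketch of how these pieces fit together, assuming a placeholder model, an assumed base learning rate, and synthetic batches; it is not the authors' implementation.

```python
# Minimal sketch of the optimisation schedule quoted above (class-IL values):
# SGD, linear LR warm-up over 4000 iterations, gradient clipping at L2-norm 10000.
# The model, base learning rate, and batches are placeholder assumptions.
import torch
import torch.nn as nn

model = nn.Linear(512, 100)                                 # stands in for the NPCL model
optimizer = torch.optim.SGD(model.parameters(), lr=0.03)    # base LR is an assumption
criterion = nn.CrossEntropyLoss()

warmup_iters = 4000                                         # 4000 for class-IL, 40 for domain-IL
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1e-3, end_factor=1.0, total_iters=warmup_iters
)

for step in range(warmup_iters):
    # Placeholder batch of 32 samples, matching the class-IL batch size in the paper.
    x = torch.randn(32, 512)
    y = torch.randint(0, 100, (32,))

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # Clip the L2-norm of the parameters' gradients at 10000, as quoted above.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10000.0)
    optimizer.step()
    scheduler.step()                                        # advances the linear warm-up
```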