NPCL: Neural Processes for Uncertainty-Aware Continual Learning
Authors: Saurav Jha, Dong Gong, He Zhao, Lina Yao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the NPCL outperforms previous CL approaches. |
| Researcher Affiliation | Collaboration | Saurav Jha (UNSW Sydney, saurav.jha@unsw.edu.au); Dong Gong (UNSW Sydney, dong.gong@unsw.edu.au); He Zhao (CSIRO's Data61, he.zhao@ieee.org); Lina Yao (CSIRO's Data61 and UNSW Sydney, lina.yao@data61.csiro.au) |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/srvCodes/NPCL. |
| Open Datasets | Yes | For class-IL, we use three public datasets: sequential CIFAR10 (S-CIFAR-10) [35], sequential CIFAR100 (S-CIFAR-100) [54], and sequential Tiny ImageNet (S-Tiny-ImageNet) [6]. For domain-IL, we use Permuted MNIST (P-MNIST) [30] and Rotated MNIST (R-MNIST) [35]. |
| Dataset Splits | Yes | We arrive at the best hyperparameter settings for each of our datasets through grid search over a validation set made of 10% of the training set on each dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as exact GPU/CPU models or processor types. |
| Software Dependencies | No | The paper mentions general software components such as the SGD optimizer, layer normalization, and ReLU, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We train all models using SGD optimizer. The number of training epochs per task for S-Tiny-ImageNet is 100, for S-CIFAR-(10/100) is 50, and that for (P/R)-MNIST is 1. ... we fix the batch sizes for new task's samples and for replay samples to 32 each for the class-IL datasets and to 128 each for the domain-IL datasets. ... NPCL training additionally relies on linearly increasing the learning rate (LR) over a period of 4000 iterations for class-IL and 40 iterations for domain-IL settings. We further apply gradient clipping [40] on the L2-norm of the NPCL parameters with a cap of 10000. ... Table 9: Hyperparameters for loss contributions that were tuned on validation sets for each dataset. (See the training-setup sketch below the table.) |
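
The Dataset Splits row notes that hyperparameters were tuned by grid search on a validation set carved from 10% of each training set. Below is a minimal PyTorch sketch of such a 90/10 holdout split; the stand-in tensor shapes and the seed are assumptions for illustration, not values from the paper.

```python
# Hypothetical 90/10 train/validation split for hyperparameter grid search.
# The stand-in dataset mimics S-CIFAR-100 image/label shapes; swap in the
# real torch Dataset in practice.
import torch
from torch.utils.data import TensorDataset, random_split

full_train = TensorDataset(torch.randn(50_000, 3, 32, 32),
                           torch.randint(0, 100, (50_000,)))
val_size = int(0.1 * len(full_train))            # 10% held out for validation
train_set, val_set = random_split(
    full_train,
    [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),  # assumed seed for reproducibility
)
```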
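
The Experiment Setup row describes SGD training with a linear learning-rate warmup (4000 iterations for class-IL, 40 for domain-IL) and gradient clipping on the L2-norm of the parameters with a cap of 10000. The sketch below shows that optimization configuration only, under stated assumptions: the model, loss, and base learning rate are placeholders, and no NPCL-specific components (neural process heads, replay buffer) are reproduced.

```python
# Minimal PyTorch sketch of the reported optimization setup: SGD, a linear
# LR warmup over a fixed number of iterations, and L2 gradient clipping.
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import LinearLR

model = nn.Linear(512, 10)          # placeholder for the actual backbone/head
criterion = nn.CrossEntropyLoss()   # placeholder objective
base_lr = 0.03                      # assumed value; tuned per dataset in the paper
warmup_iters = 4000                 # 4000 for class-IL, 40 for domain-IL

optimizer = SGD(model.parameters(), lr=base_lr)
# Linearly ramp the LR from a small fraction of base_lr up to base_lr.
scheduler = LinearLR(optimizer, start_factor=1e-3, end_factor=1.0,
                     total_iters=warmup_iters)

def train_step(x, y):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # Clip the L2-norm of the gradients with a cap of 10000, as reported.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10000.0)
    optimizer.step()
    scheduler.step()                # advances the linear warmup by one iteration
    return loss.item()
```

In a faithful reproduction, `train_step` would be driven by per-task loaders using the batch sizes quoted above (32 each for new-task and replay samples in class-IL, 128 each in domain-IL).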