TRGP: Trust Region Gradient Projection for Continual Learning

Authors: Sen Lin, Li Yang, Deliang Fan, Junshan Zhang

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each variable, the assessed result, and the supporting evidence quoted from the paper.

Research Type: Experimental. Evidence: "Extensive experiments show that our approach achieves significant improvement over related state-of-the-art methods." (Abstract; Section 5, Experimental Results)
Researcher Affiliation: Academia. Evidence: "1 School of ECEE, Arizona State University; 2 Department of ECE, University of California, Davis"
Pseudocode: No. No explicitly labeled pseudocode or algorithm blocks were found in the paper.
Open Source Code: Yes. Evidence: "For the experimental results presented in the main text, we include the code in the supplemental material, and specify all the training details in Section 5.1 and Appendix A."
Open Datasets: Yes. Evidence: "1) PMNIST. Following (Lopez-Paz & Ranzato, 2017; Saha et al., 2021)... 2) CIFAR-100 Split. We split the classes of CIFAR-100 (Krizhevsky et al., 2009)... 3) CIFAR-100 Sup. We divide the CIFAR-100 dataset... 4) 5-Datasets. We use a sequence of 5-Datasets, which includes CIFAR-10, MNIST, SVHN (Netzer et al., 2011), notMNIST (Bulatov, 2011) and Fashion-MNIST (Xiao et al., 2017)... 5) MiniImageNet Split. We split the 100 classes of MiniImageNet (Vinyals et al., 2016)..."
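As a concrete illustration of how such class-incremental benchmarks are typically built, below is a minimal PyTorch sketch of the CIFAR-100 Split. The quoted passage is truncated, so the task layout (10 tasks of 10 classes each, as in the GPM setup of Saha et al., 2021) is an assumption here, not a value stated in the excerpt.

```python
from torch.utils.data import Subset
from torchvision import datasets, transforms

def cifar100_split(root="./data", n_tasks=10, train=True):
    """Partition CIFAR-100 into sequential class-incremental tasks.

    The 10-tasks-of-10-classes layout is assumed (GPM-style); the
    paper's truncated quote does not state the exact split.
    """
    ds = datasets.CIFAR100(root, train=train, download=True,
                           transform=transforms.ToTensor())
    per_task = 100 // n_tasks
    tasks = []
    for t in range(n_tasks):
        classes = set(range(t * per_task, (t + 1) * per_task))
        # collect indices of all samples whose label falls in this task
        idx = [i for i, y in enumerate(ds.targets) if y in classes]
        tasks.append(Subset(ds, idx))
    return tasks
```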
Dataset Splits: Yes. Evidence: "train each task for a maximum of 200 epochs with the early-termination strategy based on the validation loss value." and "Note that the y-axis is the validation accuracy on a split validation dataset used during training, following the setup in (Saha et al., 2021)."
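The paper names the early-termination criterion (validation loss) but the excerpt gives no patience or tolerance values, so the loop below is only a generic sketch; the patience setting and the train_one_epoch/evaluate helpers are hypothetical.

```python
# Generic early-termination loop on validation loss. The 200-epoch cap
# comes from the quote above; patience, tolerance, and the helpers
# train_one_epoch / evaluate are hypothetical placeholders.
best_loss, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(200):
    train_one_epoch(model, train_loader)      # hypothetical helper
    val_loss = evaluate(model, val_loader)    # hypothetical helper
    if val_loss < best_loss - 1e-4:           # improved: reset counter
        best_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:            # no recent improvement
            break
```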
Hardware Specification: No. The paper does not provide specific hardware details, such as the GPU or CPU models used for the experiments.
Software Dependencies: No. The paper does not provide version numbers for the software dependencies or libraries used in the experiments.
Experiment Setup: Yes. Evidence: "We use a 3-layer fully-connected network with two hidden layers of 100 units." and "train the network for 5 epochs with a batch size of 10 for each task." and "The batch size is set to 64." and "a maximum of 200 epochs with the early-termination strategy." and "The threshold ϵ_l is set to 0.5, and we select the top-2 tasks that satisfy the condition in Eq. (3)."
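To make the quoted setup concrete, here is a minimal sketch of the PMNIST backbone and of the layer-wise trust-region selection as we read Eq. (3): an old task j enters the trust region when the new task's layer gradient, projected onto task j's saved input subspace, retains more than a fraction ϵ_l of the gradient norm, and the top-2 such tasks are kept. The 784/10 input and output sizes and the orthonormal-basis representation are assumptions; consult the paper and its released code for the authors' exact formulation.

```python
import torch
import torch.nn as nn

# PMNIST backbone per the quote: 3 linear layers with two hidden layers
# of 100 units (784 inputs / 10 outputs assumed for 28x28 digit images).
mlp = nn.Sequential(
    nn.Linear(784, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 10),
)

def select_trust_region(grad, task_bases, eps=0.5, top_k=2):
    """Pick at most `top_k` old tasks whose saved subspace captures more
    than `eps` of the new task's layer-gradient norm (our reading of the
    Eq. (3) condition; see the paper for the exact form).

    grad:       (out_dim, in_dim) weight gradient for the current layer
    task_bases: {task_id: B} with B an (in_dim, r) orthonormal basis of
                the input subspace stored for that old task
    """
    g_norm = grad.norm()
    ratios = {}
    for j, B in task_bases.items():
        proj = grad @ B @ B.T                 # projection onto span(B)
        ratios[j] = (proj.norm() / g_norm).item()
    selected = [(r, j) for j, r in ratios.items() if r > eps]
    selected.sort(reverse=True)               # strongest overlap first
    return [j for _, j in selected[:top_k]]
```

In the paper, the subspaces of the selected trust-region tasks are then reused through a scaled weight projection during training; the sketch above covers only the selection step tied to ϵ_l = 0.5 and the top-2 rule.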