TRGP: Trust Region Gradient Projection for Continual Learning
Authors: Sen Lin, Li Yang, Deliang Fan, Junshan Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper states "Extensive experiments show that our approach achieves significant improvement over related state-of-the-art methods" and includes Section 5, "Experimental Results". |
| Researcher Affiliation | Academia | 1School of ECEE, Arizona State University, 2Department of ECE, University of California, Davis |
| Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | For the experimental results presented in the main text, we include the code in the supplemental material, and specify all the training details in Section 5.1 and Appendix A. |
| Open Datasets | Yes | 1) PMNIST. Following (Lopez-Paz & Ranzato, 2017; Saha et al., 2021)... 2) CIFAR-100 Split. We split the classes of CIFAR-100 (Krizhevsky et al., 2009)... 3) CIFAR-100 Sup. We divide the CIFAR-100 dataset... 4) 5-Datasets. We use a sequence of 5-Datasets which includes CIFAR-10, MNIST, SVHN (Netzer et al., 2011), not-MNIST (Bulatov, 2011) and Fashion MNIST (Xiao et al., 2017)... 5) Mini Image Net Split. We split the 100 classes of Mini Image Net (Vinyals et al., 2016)... |
| Dataset Splits | Yes | The paper trains "each task for maximum of 200 epochs with the early termination strategy based on the validation loss value" and notes that "the y-axis is the validation accuracy with a split validation dataset used during training, by following the setup in (Saha et al., 2021)". |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | The paper describes the setup in detail, e.g., "We use a 3-layer fully-connected network" with two hidden layers of 100 units, trained "for 5 epochs with batch size of 10 for each task"; in other experiments "the batch size is set to 64" and training runs for a "maximum of 200 epochs with the early termination strategy"; and "The threshold ϵl is set to 0.5, and we select top-2 tasks that satisfy condition Eq. (3)". A minimal sketch of the fully-connected configuration follows this table. |
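
For orientation, below is a minimal sketch of the fully-connected configuration quoted in the Experiment Setup row (a 3-layer network with two hidden layers of 100 units, trained for 5 epochs per task with batch size 10), assuming PyTorch. The optimizer, learning rate, and input/output dimensions are assumptions not stated in the excerpt, and the sketch deliberately omits TRGP's trust-region gradient projection, which is the paper's actual contribution.

```python
# Illustrative sketch only: the reported per-task training configuration,
# NOT the authors' TRGP method (no trust-region gradient projection here).
import torch
import torch.nn as nn


class MLP(nn.Module):
    """3-layer fully-connected network with two hidden layers of 100 units."""

    def __init__(self, in_dim=784, hidden=100, n_classes=10):  # dims are assumptions
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x.view(x.size(0), -1))


def train_task(model, loader, epochs=5, lr=0.01):
    """Train on one task; 5 epochs and batch size 10 (set in the DataLoader)
    follow the quoted setup. Optimizer and learning rate are assumptions."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
```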