Few-Shot Lifelong Learning
Authors: Pratik Mazumder, Pravendra Singh, Piyush Rai
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets. Specifically, we outperform the state-of-the-art method by an absolute margin of 19.27% for the CUB dataset. |
| Researcher Affiliation | Collaboration | Pratik Mazumder¹, Pravendra Singh², Piyush Rai¹ (¹Department of Computer Science and Engineering, IIT Kanpur, India; ²Independent Researcher, India) |
| Pseudocode | No | The paper describes the proposed method in text and with mathematical equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We perform experiments in the FSCIL setting using three image classification datasets: CIFAR-100 (Krizhevsky and Hinton 2009), miniImageNet (Vinyals et al. 2016), and CUB-200 (Wah et al. 2011). |
| Dataset Splits | No | The paper describes the construction of training and test sets but does not explicitly mention a distinct validation set or its split details for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper mentions using a "ResNet-18 architecture" for experiments, which is a model architecture, but does not provide any specific details about the hardware used (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper mentions implementing the method but does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train Θ_F^(1) and Θ_C^(1) on the base training set D^(1) with an initial learning rate of 0.1 and a mini-batch size of 128. After the 30th and 40th epochs, we reduce the learning rate to 0.01 and 0.001, respectively. We train on D^(1) for a total of 50 epochs and then discard Θ_C^(1). We finetune the feature extractor on each of the few-shot training sets D^(t>1) for 30 epochs, with a learning rate of 1e-4 (and 1e-3 for CUB-200). |
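
For readers who want to map the quoted "Experiment Setup" onto code, the following is a minimal PyTorch sketch of the reported schedule. It is illustrative only: the optimizer (SGD with momentum), the cross-entropy loss for the base session, and the placeholder loss for the few-shot sessions are assumptions not stated in the quote; the learning rates, milestone epochs, epoch counts, and batch size follow the quoted setup, and the `train_base` / `finetune_session` names are hypothetical.

```python
# Hedged sketch of the quoted training schedule, not the authors' released code.
# Assumptions beyond the quote: SGD with momentum, cross-entropy base loss,
# and a caller-supplied loss for the few-shot finetuning sessions.
import torch
import torch.nn as nn
from torchvision.models import resnet18


def train_base(base_loader, num_base_classes, device="cuda"):
    """Base session on D^(1): lr 0.1, dropped to 0.01 and 0.001 after epochs 30 and 40, 50 epochs total."""
    model = resnet18()
    model.fc = nn.Linear(model.fc.in_features, num_base_classes)  # Theta_C^(1), discarded afterwards
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()                              # assumed loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # assumed optimizer
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 40], gamma=0.1)
    for _ in range(50):
        for images, labels in base_loader:  # DataLoader built with batch_size=128
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
    model.fc = nn.Identity()  # keep only the feature extractor Theta_F^(1)
    return model


def finetune_session(model, session_loader, session_loss, lr=1e-4, device="cuda"):
    """Few-shot session t > 1: finetune the feature extractor for 30 epochs (lr 1e-3 for CUB-200)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # assumed optimizer
    for _ in range(30):
        for images, labels in session_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = session_loss(model(images), labels)  # the paper's session objective; placeholder here
            loss.backward()
            optimizer.step()
    return model
```

The milestone scheduler reproduces the quoted 0.1 → 0.01 → 0.001 decay; everything specific to the paper's few-shot loss is left to the `session_loss` callable, since the quoted setup only fixes the optimization schedule.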