Knowledge Consolidation based Class Incremental Online Learning with Limited Data

Authors: Mohammed Asad Karim, Vinay Kumar Verma, Pravendra Singh, Vinay Namboodiri, Piyush Rai

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach via extensive experiments across various datasets. We follow the evaluation protocol where the model is updated in an online fashion and later evaluated on the unseen data (Section 2). We compare the performance of our model (KCCIOL) against several baselines. (A sketch of this online evaluation protocol appears after the table.)
Researcher Affiliation | Academia | 1 Indian Institute of Technology Kanpur, India; 2 Duke University, United States; 3 Indian Institute of Technology Roorkee, India; 4 University of Bath, United Kingdom
Pseudocode | Yes | Algorithm 1 (Training Algorithm), Algorithm 2 (KCCIOL), Algorithm 3 (Mask Calculation), Algorithm 4 (Evaluation Protocol)
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for their methodology.
Open Datasets | Yes | The Omniglot dataset [Lake et al., 2015] contains 1623 classes of different handwritten characters from 50 different alphabets. [...] Vinyals et al. [Vinyals et al., 2016] proposed the mini-imagenet dataset, which is a subset of the imagenet dataset.
Dataset Splits | Yes | The first 963 classes constitute the (Xtrain, Ytrain) and the remaining classes are used as (Xtest, Ytest). For learning trajectory during training, τtr consists of 10 samples from a class randomly sampled from the training set. τval consists of 10+1 samples where ten samples are randomly sampled from the train set, and the 11th sample belongs to the class used in τtr. [...] We use 15/5 samples per class for τtr/τval during evaluation. [...] We use 64 classes for training and 20 classes for testing. For learning trajectory during training, τtr consists of 10 samples from a class randomly sampled from the training set. τval consists of 15 samples where 10 samples are randomly sampled from the training set, and 5 samples belong to the class used in τtr. (A sketch of this split and trajectory construction appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'Adam optimizer' and 'ReLU activation function' but does not specify version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | Hyperparameter Settings: We train our model using hyperparameters: β1 = 1e-4, α1 = 1e-2, steps1 = 20000, β2 = 1e-4, α2 = 1e-2, γ = 5e-5, steps2 = 15000, β3 = 1e-4, α3 = 1e-2, λ = 5e-4, steps3 = 4000, δ = 0.5. Model Architecture: We use six convolutional layers followed by two fully connected layers, and each convolutional layer contains 256 filters of 3×3 kernel size with (2, 1, 2, 1, 2, 2) strides (same as used in [Javed and White, 2019]). ReLU activation function is used for the non-linearity.
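
For the online evaluation protocol quoted in the Research Type row (update the model on a stream of data, then test on unseen samples), a minimal sketch is given below. It assumes integer class labels, image tensors, and the hypothetical helpers `online_update` and `accuracy`; the paper's actual procedure is given by its Algorithm 4 and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def online_update(model, optimizer, x, y):
    """One online gradient step on a single pair (x: image tensor, y: int label)."""
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x.unsqueeze(0)), torch.tensor([y]))
    loss.backward()
    optimizer.step()

@torch.no_grad()
def accuracy(model, samples):
    """Fraction of held-out (x, y) pairs classified correctly."""
    correct = sum(int(model(x.unsqueeze(0)).argmax(dim=1).item() == y)
                  for x, y in samples)
    return correct / len(samples)

def evaluate_online(model, optimizer, stream, held_out):
    """Sketch of the quoted protocol: update the model online on a stream of
    class trajectories, then evaluate it on unseen samples of those classes."""
    for trajectory in stream:        # one trajectory = samples of one class
        for x, y in trajectory:      # online: one update per incoming sample
            online_update(model, optimizer, x, y)
    return accuracy(model, held_out)
```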
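
The split and trajectory construction quoted in the Dataset Splits row can be paraphrased as the sketch below; the `samples_by_class` dictionary layout (class id mapped to a list of samples) and the helper names are illustrative assumptions, not the authors' code.

```python
import random

def split_omniglot(classes):
    """Quoted Omniglot split: the first 963 classes form (Xtrain, Ytrain);
    the remaining classes form (Xtest, Ytest)."""
    return classes[:963], classes[963:]

def train_trajectory(samples_by_class, train_classes, k_tr=10, k_rand=10, k_same=1):
    """One (tau_tr, tau_val) pair as quoted: tau_tr holds k_tr samples of a single
    randomly chosen class; tau_val holds k_rand samples drawn at random from the
    training set plus k_same additional samples of that same class
    (Omniglot: 10 and 10+1; mini-ImageNet: 10 and 10+5)."""
    c = random.choice(train_classes)
    tau_tr = random.sample(samples_by_class[c], k_tr)
    pool = [s for cls in train_classes for s in samples_by_class[cls]]
    tau_val = random.sample(pool, k_rand) + random.sample(samples_by_class[c], k_same)
    return tau_tr, tau_val
```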
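
The architecture quoted in the Experiment Setup row (six convolutional layers with 256 filters of 3×3 kernels and strides (2, 1, 2, 1, 2, 2), followed by two fully connected layers with ReLU non-linearity) can be sketched in PyTorch as follows. The input channel count, padding, width of the first fully connected layer, and number of output classes are assumptions, as the quoted excerpt does not state them.

```python
import torch
import torch.nn as nn

class KCCIOLNet(nn.Module):
    """Sketch of the quoted architecture: six conv layers, each with 256 filters
    of 3x3 kernels and strides (2, 1, 2, 1, 2, 2), followed by two fully
    connected layers; ReLU is the non-linearity.  Input channels, padding,
    the first FC width, and the output size are assumptions."""

    def __init__(self, in_channels=3, fc_width=1024, num_classes=64):
        super().__init__()
        convs, c = [], in_channels
        for stride in (2, 1, 2, 1, 2, 2):
            convs += [nn.Conv2d(c, 256, kernel_size=3, stride=stride, padding=1),
                      nn.ReLU(inplace=True)]
            c = 256
        self.features = nn.Sequential(*convs)
        # LazyLinear infers the flattened feature size on the first forward pass,
        # since the input resolution is not stated in the quoted excerpt.
        self.fc1 = nn.LazyLinear(fc_width)
        self.fc2 = nn.Linear(fc_width, num_classes)

    def forward(self, x):
        h = torch.flatten(self.features(x), start_dim=1)
        return self.fc2(torch.relu(self.fc1(h)))
```

Under the quoted hyperparameters, the β values (all 1e-4) would plausibly serve as the Adam learning rate for the outer updates, e.g. `torch.optim.Adam(net.parameters(), lr=1e-4)`; the inner-loop rates α1..α3, the weights γ and λ, the step counts, and δ appear to parameterise the paper's staged training procedure (Algorithms 1-3) and are not reproduced here.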