Representational Continuity for Unsupervised Continual Learning
Authors: Divyam Madaan, Jaehong Yoon, Yuanchun Li, Yunxin Liu, Sung Ju Hwang
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a systematic study analyzing the learned feature representations and show that unsupervised visual representations are surprisingly more robust to catastrophic forgetting, consistently achieve better performance, and generalize better to out-of-distribution tasks than SCL. Furthermore, we find that UCL achieves a smoother loss landscape through qualitative analysis of the learned representations and learns meaningful feature representations. Additionally, we propose Lifelong Unsupervised Mixup (LUMP), a simple yet effective technique that interpolates between the current task and previous tasks' instances to alleviate catastrophic forgetting for unsupervised representations. We release our code online. Table 1 shows the evaluation results for supervised and unsupervised representations learnt by SimSiam (Chen & He, 2021) across various continual learning strategies. (A minimal sketch of the LUMP interpolation appears after this table.) |
| Researcher Affiliation | Collaboration | Divyam Madaan (1), Jaehong Yoon (2,3), Yuanchun Li (5,6), Yunxin Liu (5,6), Sung Ju Hwang (2,4); (1) New York University, (2) KAIST, (3) Microsoft Research, (4) AITRICS, (5) Institute for AI Industry Research (AIR), (6) Tsinghua University; divyam.madaan@nyu.edu, {jaehong.yoon,sjhwang82}@kaist.ac.kr, liyuanchun@air.tsinghua.edu.cn, liuyunxin@air.tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. It provides mathematical formulations and descriptions of methods. |
| Open Source Code | Yes | We release our code online. |
| Open Datasets | Yes | Split CIFAR-10 (Krizhevsky, 2012) consists of two random classes out of the ten classes for each task. Split CIFAR-100 (Krizhevsky, 2012) consists of five random classes out of the 100 classes for each task. Split Tiny-ImageNet is a variant of the ImageNet dataset (Deng et al., 2009) containing five random classes out of the 100 classes for each task, with images sized 64 × 64 pixels. (A task-split construction sketch appears after this table.) |
| Dataset Splits | No | The paper does not explicitly provide specific percentages or counts for training, validation, and testing splits. It mentions training on a sequence of tasks and evaluation with a KNN classifier. |
| Hardware Specification | No | The paper mentions using a 'single-head ResNet-18' architecture but does not specify any particular hardware components like CPU or GPU models used for experiments. |
| Software Dependencies | No | The paper mentions using the 'DER (Buzzega et al., 2020) open-source codebase' and referring to the original implementations for 'SimSiam' and 'Barlow Twins' but does not specify software versions for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We follow the hyperparameter setup of Buzzega et al. (2020) for all the SCL strategies and tune them for the UCL representation learning strategies. All the learned representations are evaluated with a KNN classifier (Wu et al., 2018) across three independent runs. Further, we use the hyperparameters obtained by SimSiam for training UCL strategies with Barlow Twins to analyze the sensitivity of UCL to hyperparameters and for a fair comparison between different methods. We train all the UCL methods for 200 epochs and evaluate with the KNN classifier (Wu et al., 2018). We provide the hyperparameters in detail in Table A.5. (A KNN-evaluation sketch appears after this table.) |
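
The Research Type row quotes the paper's description of LUMP, which interpolates current-task instances with replayed past-task instances before applying the unsupervised objective. Below is a minimal sketch of that interpolation, assuming a PyTorch setup, a Beta-sampled mixing coefficient, and a placeholder `ucl_loss` standing in for any SimSiam-style objective; the function names and the alpha value are illustrative assumptions, not taken from the authors' released code.

```python
import torch

def lump_step(model, ucl_loss, x_current, x_buffer, alpha=0.1):
    """Sketch of one LUMP-style update: mix current-task images with
    replayed past-task images, then apply the unsupervised loss.

    x_current / x_buffer: pairs of augmented views (view1, view2) for the
    current batch and the replay-buffer batch (shapes must match).
    ucl_loss: any instance-wise unsupervised objective, e.g. a
    SimSiam-style loss over two augmented views (assumed interface).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()  # mixing coefficient
    view1 = lam * x_current[0] + (1.0 - lam) * x_buffer[0]        # mix first views
    view2 = lam * x_current[1] + (1.0 - lam) * x_buffer[1]        # mix second views
    return ucl_loss(model, view1, view2)
```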
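The Open Datasets row describes the split protocol (two random classes per task for Split CIFAR-10, five per task for Split CIFAR-100 and Split Tiny-ImageNet). The sketch below shows one plausible way to build such task subsets with torchvision; the class ordering, seed, and use of `Subset` are assumptions and need not match the paper's exact splits.

```python
import numpy as np
from torchvision import datasets
from torch.utils.data import Subset

def make_split_tasks(dataset, classes_per_task, seed=0):
    """Partition a classification dataset into continual-learning tasks,
    each holding `classes_per_task` randomly chosen classes (illustrative)."""
    targets = np.array(dataset.targets)
    classes = np.unique(targets)
    rng = np.random.default_rng(seed)
    rng.shuffle(classes)                      # random class-to-task assignment
    tasks = []
    for start in range(0, len(classes), classes_per_task):
        task_classes = classes[start:start + classes_per_task]
        idx = np.where(np.isin(targets, task_classes))[0]
        tasks.append(Subset(dataset, idx))    # subset containing only these classes
    return tasks

# Example: Split CIFAR-10 as five tasks of two classes each
cifar10 = datasets.CIFAR10(root="./data", train=True, download=True)
split_cifar10 = make_split_tasks(cifar10, classes_per_task=2)
```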
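The Experiment Setup row notes that all learned representations are evaluated with a KNN classifier (Wu et al., 2018). The following is a hedged sketch of such an evaluation over frozen encoder features, using cosine similarity and a majority vote over the k nearest training samples; the value of k, the voting rule, and the assumption that the encoder already lives on `device` are not taken from the paper's exact protocol.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def knn_eval(encoder, train_loader, test_loader, k=200, device="cuda"):
    """KNN evaluation sketch: classify test images by cosine similarity of
    their frozen-encoder features to training features (assumed protocol)."""
    encoder.eval()
    feats, labels = [], []
    for x, y in train_loader:
        feats.append(F.normalize(encoder(x.to(device)), dim=1))  # L2-normalized features
        labels.append(y.to(device))
    feats, labels = torch.cat(feats), torch.cat(labels)

    correct = total = 0
    for x, y in test_loader:
        q = F.normalize(encoder(x.to(device)), dim=1)
        sim = q @ feats.T                      # cosine similarity to all training features
        nn_idx = sim.topk(k, dim=1).indices    # indices of the k nearest training samples
        pred = labels[nn_idx].mode(dim=1).values  # majority vote over neighbor labels
        correct += (pred == y.to(device)).sum().item()
        total += y.size(0)
    return correct / total
```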