Multi-Domain Multi-Task Rehearsal for Lifelong Learning
Authors: Fan Lyu, Shuai Wang, Wei Feng, Zihan Ye, Fuyuan Hu, Song Wang
AAAI 2021, pp. 8819-8827 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments on benchmark datasets validate the proposed approach can effectively mitigate the unpredictable domain shift." and "We evaluate our MDMT rehearsal on four popular lifelong learning datasets for image classification and achieve new state-of-the-art performance." |
| Researcher Affiliation | Academia | (1) College of Intelligence and Computing, Tianjin University; (2) School of Electronic & Information Engineering, Suzhou University of Science and Technology; (3) Department of Computer Science and Engineering, University of South Carolina. Emails: {fanlyu, wangshuai201909, wfeng}@tju.edu.cn, {zihanye@post, fuyuanhu@mail}.usts.edu.cn, songwang@cec.sc.edu |
| Pseudocode | Yes | Algorithm 1 (MDMT rehearsal based lifelong learning) defines three procedures. TRAIN(f_θ, f_{θ_{1:T}}, {D^trn_1, ..., D^trn_T}): initialize M, F ← {}, {}; for t = 1 to T and each (x, y) ∈ D^trn_t, compute (g, g_{1:t}) ← ∇_θ ℓ(f_θ(x, t), y); if t > 1, compute (g_ref, g_{1:t-1}) ← ∇_θ ℓ(f_θ, f_{θ_{1:t-1}}, M), set g_ref ← g_ref + ∇_θ ℓ(f_θ, F_ref) and g ← g + g_ref; update θ ← θ − StepSize · g and θ_{1:t} ← θ_{1:t} − StepSize · g_{1:t}; after each task, (M, F) ← STOREMEM(M, F, D^trn_t, f_θ). STOREMEM(M, F, D, f): for i = 1 to \|M\|/T, draw (x, y) from D, append (x, y) to M and f(x) to F; return M, F. EVAL(f_θ, f_{θ_{1:T}}, {D^tst_1, ..., D^tst_T}): for each task t, a_t ← mean of Accuracy(f_{θ_t}(f_θ(x, t)), y) over D^tst_t; return a ∈ R^T. (A hedged PyTorch sketch of the training loop appears after this table.) |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate the proposed method on four image recognition datasets. (1) Permuted MNIST. (Kirkpatrick et al. 2017)... (2) Split CIFAR. (Zenke, Poole, and Ganguli 2017): this dataset consists of 20 disjoint subsets of CIFAR100 dataset (Krizhevsky, Hinton et al. 2009)... (3) Split CUB. (Chaudhry et al. 2018b): the CUB dataset (Wah et al. 2011) is split into 20 disjoint subsets... (4) Split AWA. (Chaudhry et al. 2018b): this dataset consists of 20 subsets of the AWA dataset (Lampert, Nickisch, and Harmeling 2009). |
| Dataset Splits | No | For the t-th dataset (task), D_t = {(x_{t,1}, y_{t,1}), ..., (x_{t,N_t}, y_{t,N_t})}, where x_{t,i} ∈ X_t is the i-th input, y_{t,i} ∈ Y_t is the corresponding label, and N_t is the number of samples. D_t can be split into a training set D^trn_t and a testing set D^tst_t; for brevity the paper writes D_t for D^trn_t. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, etc.) are mentioned for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | Following the previous works (Lopez-Paz and Ranzato 2017; Chaudhry et al. 2018b; Guo et al. 2019), for Permuted MNIST we adopt a standard fully-connected network with two hidden layers, where each layer has 256 units with ReLU activation. For Split CIFAR we use a reduced ResNet18 (He et al. 2016). For Split CUB and Split AWA, we use a standard ResNet18. (A minimal sketch of the Permuted MNIST backbone appears after this table.) |
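
The Pseudocode row condenses the paper's TRAIN and STOREMEM procedures. Below is a minimal PyTorch sketch of that rehearsal loop, not the authors' implementation: the names `train_mdmt`, `task_heads`, and `mem_per_task` are illustrative, and the replay term is a plain joint loss over stored exemplars rather than the paper's reference-gradient construction (g_ref from memory M and stored features F_ref), which the excerpt does not specify in full.

```python
import random

import torch
import torch.nn.functional as F


def train_mdmt(model, task_heads, train_loaders, mem_per_task=25, lr=0.1):
    """Sequentially train on each task, replaying a small episodic memory
    of earlier tasks alongside the current batch (rehearsal)."""
    memory = []  # (x, y, task_id) exemplars kept from previous tasks
    params = list(model.parameters())
    for head in task_heads:
        params += list(head.parameters())
    opt = torch.optim.SGD(params, lr=lr)

    for t, loader in enumerate(train_loaders):
        for x, y in loader:
            opt.zero_grad()
            # Current-task loss through the shared backbone and task head.
            loss = F.cross_entropy(task_heads[t](model(x)), y)
            if t > 0 and memory:
                # Rehearsal term: replay one stored exemplar batch from an
                # old task through that task's own head. This stands in for
                # the paper's reference-gradient term (g_ref).
                xm, ym, tm = random.choice(memory)
                loss = loss + F.cross_entropy(task_heads[tm](model(xm)), ym)
            loss.backward()
            opt.step()

        # STOREMEM: keep a small subset of this task's data
        # (Algorithm 1 stores |M|/T samples per task).
        kept = 0
        for x, y in loader:
            take = min(x.size(0), mem_per_task - kept)
            if take <= 0:
                break
            memory.append((x[:take], y[:take], t))
            kept += take
```

Swapping the joint replay loss for a gradient projection step, as GEM/A-GEM-style methods do, would bring the sketch closer to gradient-based rehearsal variants.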
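
The Experiment Setup row names a two-hidden-layer, 256-unit ReLU network for Permuted MNIST. A minimal sketch of such a backbone follows, assuming a flattened 784-dimensional input and a 10-way classification head (neither is stated in the excerpt):

```python
import torch.nn as nn


class PermutedMNISTNet(nn.Module):
    """Fully-connected net with two 256-unit ReLU hidden layers, per the
    setup row; the class name, 784-dim input, and 10-way head are assumed."""

    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # Flatten 28x28 images to vectors before the linear layers.
        return self.head(self.backbone(x.flatten(1)))
```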