Compositional Language Continual Learning
Authors: Yuanpeng Li, Liang Zhao, Kenneth Church, Mohamed Elhoseiny
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed method achieves significant improvement over state-of-the-art methods. It enables knowledge transfer and prevents catastrophic forgetting, yielding more than 85% accuracy for up to 100 stages, compared with less than 50% accuracy for baselines on the instruction learning task. It also shows significant improvement on the machine translation task. |
| Researcher Affiliation | Collaboration | Yuanpeng Li, Liang Zhao, Kenneth Church (Baidu Research); Mohamed Elhoseiny (KAUST, Stanford University) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at: https://github.com/yli1/CLCL. |
| Open Datasets | Yes | We extend the grammar in the SCAN dataset (Lake & Baroni, 2017) to generate data. The machine translation dataset is generated similarly from the translation dataset in SCAN (Lake & Baroni, 2017). |
| Dataset Splits | Yes | We use one set as the initial-stage training data (6,601 samples) and reserve the other set as an initial dataset for evaluating catastrophic forgetting in the continual stages (Forget, 6,602 samples). The reserved data is also used to evaluate long-term catastrophic forgetting (Long-forget). We then add Transfer to Forget for the next stage (see the protocol sketch after this table). The machine translation dataset is generated similarly from the translation dataset in SCAN (Lake & Baroni, 2017); the original training data serves as the initial training data and the original test data as the initial test data. |
| Hardware Specification | No | The paper mentions implementing methods with TensorFlow but does not specify any hardware details such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions using 'TensorFlow (Abadi et al., 2016)' but does not provide a specific version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | The state size is h = 32 for the encoder and 2h = 64 for the decoder. We also use kp = 64, kf = 8, and α = 0.1. For EWC (Kirkpatrick et al., 2017a) and MAS (Aljundi et al., 2018), we use a parameter regularization weight of 10 (see the regularization sketch after this table). In the initial stage, the batch size is 512 and training runs for 5,000 steps. In each continual stage, the batch size is 1, as each continual stage contains only one sample, and training runs for 1,000 steps. There are 100 continual stages. |
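
For readers reconstructing the training protocol from the "Dataset Splits" and "Experiment Setup" rows, the following is a minimal sketch of the stage loop, assuming placeholder `train` and `evaluate` helpers (the paper's actual model is a TensorFlow sequence-to-sequence network and is not reproduced here). Only the numeric settings, the 6,601/6,602 split, the one-sample continual stages, and the rule that each Transfer sample is appended to the Forget set come from the table; function names and everything else are illustrative assumptions.

```python
# Sketch of the continual-learning protocol described in the rows above.
# All function bodies are placeholders; the paper's model is a TensorFlow
# seq2seq with encoder state h = 32, decoder state 2h = 64, kp = 64, kf = 8,
# and alpha = 0.1.
import random

INITIAL_BATCH_SIZE, INITIAL_STEPS = 512, 5_000
CONTINUAL_BATCH_SIZE, CONTINUAL_STEPS = 1, 1_000
NUM_CONTINUAL_STAGES = 100


def train(model, data, batch_size, steps):
    """Placeholder for gradient training (assumption; real code uses TensorFlow)."""
    return model


def evaluate(model, data):
    """Placeholder accuracy metric (assumption)."""
    return random.random()


def run_protocol(initial_train, forget_set, continual_samples, model=None):
    """Initial stage on 6,601 samples, then 100 one-sample continual stages.

    `forget_set` (6,602 reserved samples) measures catastrophic forgetting;
    after each stage the new Transfer sample is appended to it, as described
    in the Dataset Splits row.
    """
    model = train(model, initial_train, INITIAL_BATCH_SIZE, INITIAL_STEPS)
    history = []
    for stage, sample in enumerate(continual_samples[:NUM_CONTINUAL_STAGES], 1):
        model = train(model, [sample], CONTINUAL_BATCH_SIZE, CONTINUAL_STEPS)
        history.append({
            "stage": stage,
            "forget_acc": evaluate(model, forget_set),   # catastrophic forgetting
            "transfer_acc": evaluate(model, [sample]),   # knowledge transfer
        })
        forget_set = forget_set + [sample]  # Transfer sample joins the Forget set
    return history
```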
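
The EWC and MAS baselines in the "Experiment Setup" row both use a parameter-regularization weight of 10. The sketch below shows the generic quadratic penalty those methods apply, purely as an illustration of where that weight enters; the function name, pure-Python form, and example values are assumptions, not the authors' TensorFlow code.

```python
# Minimal sketch of the parameter-regularization penalty used by the EWC / MAS
# baselines, with the weight of 10 reported in the paper. Illustrative only.
REG_WEIGHT = 10.0  # value reported for both EWC and MAS baselines


def regularization_penalty(params, old_params, importance):
    """Quadratic penalty: weight * sum_i importance_i * (theta_i - theta_i*)^2.

    For EWC, `importance` is the diagonal Fisher information; for MAS it is the
    accumulated gradient magnitude of the output w.r.t. each parameter.
    (Conventions vary on an extra 1/2 factor; omitted here.)
    """
    return REG_WEIGHT * sum(
        w * (p - p_old) ** 2
        for p, p_old, w in zip(params, old_params, importance)
    )


# Example: three scalar parameters drifting from their previous-stage values.
penalty = regularization_penalty(
    params=[0.5, -1.2, 0.3],
    old_params=[0.4, -1.0, 0.3],
    importance=[2.0, 0.5, 1.0],
)
```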