CLeAR: Continual Learning on Algorithmic Reasoning for Human-like Intelligence

Authors: Bong Gyun Kang, HyunGi Kim, Dahuin Jung, Sungroh Yoon

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted extensive experiments consisting of 15 tasks with various levels of the Chomsky hierarchy, ranging from in-hierarchy to inter-hierarchy scenarios. CLeAR not only achieved near-zero forgetting but also improved accuracy during subsequent tasks, a phenomenon known as backward transfer, while previous CL methods designed for image classification failed drastically.
Researcher Affiliation | Academia | Bong Gyun Kang (1), HyunGi Kim (2), Dahuin Jung (2), Sungroh Yoon (1,2); (1) Interdisciplinary Program in Artificial Intelligence, Seoul National University; (2) Department of Electrical and Computer Engineering, Seoul National University
Pseudocode | Yes | Algorithm 1: CLeAR training procedure for a given task t
Open Source Code | Yes | The code is available at https://github.com/Pusheen-cat/CLeAR_2023
Open Datasets | No | The paper states 'for fairness, we used the data format proposed in the previous paper [11]' and describes how input sequences were sampled (e.g., 'sampled input sequence length from the uniform distribution U(1, N)'), but it does not provide concrete access (a link, DOI, or specific repository) to the datasets or data instances used in the experiments. A hypothetical generation sketch is given below the table.
Dataset Splits | No | The paper describes training and test data and how the test set is generated (an 'OOD dataset'), but it does not mention or specify a separate validation dataset or its splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or cloud-computing instance specifications) used for running the experiments.
Software Dependencies | No | The paper mentions various model architectures such as RNN, LSTM, Stack-RNN, and Tape-RNN, but it does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers.
Experiment Setup | Yes | Each task was averaged over three repeats, trained for 50,000 epochs each. The length of the external memory stack and tape in Stack-RNN and Tape-RNN was set to 40, each element had 8 dimensions [11], and dimensions were doubled with CL of more than 5 tasks. A configuration sketch illustrating this sizing rule follows the table.