CLeAR: Continual Learning on Algorithmic Reasoning for Human-like Intelligence
Authors: Bong Gyun Kang, HyunGi Kim, Dahuin Jung, Sungroh Yoon
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments consisting of 15 tasks at various levels of the Chomsky hierarchy, ranging from in-hierarchy to inter-hierarchy scenarios. CLeAR not only achieved near-zero forgetting but also improved accuracy on subsequent tasks, a phenomenon known as backward transfer, while previous CL methods designed for image classification drastically failed. |
| Researcher Affiliation | Academia | Bong Gyun Kang1, HyunGi Kim2, Dahuin Jung2, Sungroh Yoon1,2 — 1 Interdisciplinary Program in Artificial Intelligence, Seoul National University; 2 Department of Electrical and Computer Engineering, Seoul National University |
| Pseudocode | Yes | Algorithm 1: CLeAR training procedure for a given task t |
| Open Source Code | Yes | The code is available at https://github.com/Pusheen-cat/CLeAR_2023 |
| Open Datasets | No | The paper states 'for fairness, we used the data format proposed in the previous paper [11]' and describes how input sequences were sampled (e.g., 'sampled input sequence length from the uniform distribution U(1, N)'). However, it does not provide concrete access (link, DOI, specific repository) to the specific datasets or data instances used in their experiments. |
| Dataset Splits | No | The paper describes training and test data, and how the test set is generated ('OOD dataset'), but it does not mention or specify a separate validation dataset or its splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or cloud computing instance specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions various model architectures like RNN, LSTM, Stack-RNN, and Tape RNN, but it does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers. |
| Experiment Setup | Yes | Results for each task were averaged over three repeats, each trained for 50,000 epochs. The length of the external memory stack and tape in Stack-RNN and Tape-RNN was set to 40, with 8 dimensions per element [11]; dimensions were doubled for CL with more than 5 tasks. |
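The external-memory configuration reported above (a stack of depth 40 with 8-dimensional elements, as used in Stack-RNN) can be sketched as a standard differentiable stack update in the style of Joulin & Mikolov's Stack-RNN. This is an illustrative NumPy sketch, not the authors' released code; the function name and the (push, pop, no-op) action parameterization are assumptions.

```python
import numpy as np

def stack_update(stack, actions, new_top):
    """Soft update of a differentiable stack.

    stack:   (depth, dim) array, row 0 is the top of the stack.
    actions: (a_push, a_pop, a_noop) probabilities summing to 1.
    new_top: (dim,) candidate element to push.
    Returns the convex combination of the three possible next stacks.
    """
    depth, dim = stack.shape
    a_push, a_pop, a_noop = actions
    # Push: new_top slides in at row 0, everything else shifts down.
    pushed = np.vstack([new_top[None, :], stack[:-1]])
    # Pop: everything shifts up, bottom is zero-padded.
    popped = np.vstack([stack[1:], np.zeros((1, dim))])
    return a_push * pushed + a_pop * popped + a_noop * stack

# Configuration matching the quoted setup: depth 40, 8-dim elements.
stack = np.zeros((40, 8))
stack = stack_update(stack, (1.0, 0.0, 0.0), np.ones(8))  # hard push
```

With soft action probabilities (e.g. from a softmax over controller logits), the same update keeps the whole memory differentiable, which is what lets the stack be trained end-to-end by backpropagation.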