Rapid Learning without Catastrophic Forgetting in the Morris Water Maze

Authors: Raymond Wang, Jaedong Hwang, Akhilan Boopathy, Ila R Fiete

ICML 2024

Reproducibility variables, results, and supporting LLM responses:

Research Type: Experimental
  LLM Response: "Our model outperforms ANN baselines from continual learning contexts applied to the task. Our findings demonstrate a significant advantage of our neural-inspired method over general state-of-the-art continual learning algorithms in the sequential Morris Water Maze task. Our method achieves significantly higher performance than baseline methods in standard continual learning." (Section 6, Experiments)

Researcher Affiliation: Academia
  LLM Response: "Massachusetts Institute of Technology. Correspondence to: Raymond L Wang <rlwang@mit.edu>."

Pseudocode: Yes
  LLM Response: "In our appendix, Algorithm 1 illustrates the pseudocode of our training loop, while Algorithm 2 illustrates how our agent is updated, and Algorithm 3 illustrates how we use Vector-HaSH." (Appendix A.7, Algorithm Pseudocode)

Open Source Code: Yes
  LLM Response: "Our code is available at: https://github.com/raymondw2/seqwm"

Open Datasets: No
  LLM Response: No concrete access information for a publicly available or open dataset. The paper describes a custom variant of a task: "We have developed a variant of the Morris Water Maze task called the sequential Morris Water Maze (sWM)."

Dataset Splits: No
  LLM Response: No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) is provided. The paper describes training and evaluation on different environments.

Hardware Specification: Yes
  LLM Response: "We conducted our experiments on a high-performance computing system. The system was equipped with an AMD EPYC 7713 64-Core Processor, 32 GB of RAM and one Nvidia RTX 2080 Ti GPU."

Software Dependencies: No
  LLM Response: No specific ancillary software details with version numbers are provided. The paper mentions using Adam for optimization and references several public continual-learning implementations and algorithms, but without versioning for the software itself.

Experiment Setup: Yes
  LLM Response: "We optimize parameters using Adam (Kingma & Ba, 2015) with a learning rate of 0.001 for 800 episodes for each environment. The maximum number of steps in each episode is set to 100 and the starting configuration (head direction and coordinates) are different. The environment is a 30×30 grid with unique, noise-added step function markings on the walls. The agent has a field of view (FOV) of 120 degrees (see Figure 1a). For each epoch, the network is trained on 200 trajectories. Each trajectory was limited to a maximum of 100 time steps... In methods utilizing a buffer, we set its capacity to 200. For the DER++ algorithm, we adhered to the optimal parameters recommended in the paper: α and β both set at 0.5. All tested methods employed Cross Entropy loss and a learning rate of 0.001."
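The experiment-setup details above can be collected into a configuration sketch. This is a minimal illustration of the quoted hyperparameters and of the DER++ objective the paper says it uses with α = β = 0.5 (current-task cross-entropy, plus an MSE term to stored logits, plus a cross-entropy term on a second buffer minibatch, following Buzzega et al., 2020). All names here are hypothetical and do not come from the authors' repository.

```python
import torch
import torch.nn.functional as F

# Hypothetical configuration mirroring the quoted setup; names are illustrative.
CONFIG = {
    "learning_rate": 1e-3,        # Adam, as stated in the paper
    "episodes_per_env": 800,
    "max_steps_per_episode": 100,
    "grid_size": (30, 30),        # 30x30 grid environment
    "fov_degrees": 120,           # agent field of view
    "trajectories_per_epoch": 200,
    "buffer_capacity": 200,       # replay buffer for buffer-based baselines
    "derpp_alpha": 0.5,           # DER++ logit-distillation weight
    "derpp_beta": 0.5,            # DER++ buffer-label cross-entropy weight
}


def derpp_loss(logits, targets,
               buf_logits, buf_stored_logits,
               buf2_logits, buf2_targets,
               alpha=CONFIG["derpp_alpha"], beta=CONFIG["derpp_beta"]):
    """Sketch of the DER++ objective (Buzzega et al., 2020):
    CE on the current batch
    + alpha * MSE between current and stored logits for one buffer batch
    + beta * CE on a second buffer batch with its stored labels."""
    loss = F.cross_entropy(logits, targets)
    loss = loss + alpha * F.mse_loss(buf_logits, buf_stored_logits)
    loss = loss + beta * F.cross_entropy(buf2_logits, buf2_targets)
    return loss
```

A training loop would then pair this loss with `torch.optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])` and a 200-sample reservoir buffer, per the settings quoted above.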