Rapid Learning without Catastrophic Forgetting in the Morris Water Maze
Authors: Raymond Wang, Jaedong Hwang, Akhilan Boopathy, Ila R Fiete
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model outperforms ANN baselines from continual learning contexts applied to the task. Our findings demonstrate a significant advantage of our neural-inspired method over general state-of-the-art continual learning algorithms in the sequential Morris Water Maze task. Our method achieves significantly higher performance than baseline methods in standard continual learning. (Section 6: Experiments) |
| Researcher Affiliation | Academia | Massachusetts Institute of Technology. Correspondence to: Raymond L Wang <rlwang@mit.edu>. |
| Pseudocode | Yes | In our appendix, Algorithm 1 illustrates the pseudocode of our training loop, while Algorithm 2 illustrates how our agent is updated, and Algorithm 3 illustrates how we use Vector-HaSH. (Appendix A.7: Algorithm Pseudocode) |
| Open Source Code | Yes | Our code is available at: https://github.com/raymondw2/seqwm |
| Open Datasets | No | No concrete access information for a publicly available or open dataset. The paper describes a custom variant of a task: 'We have developed a variant of the Morris Water Maze task called the sequential Morris Water Maze (sWM).' |
| Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) is provided. The paper describes training and evaluation on different environments. |
| Hardware Specification | Yes | We conducted our experiments on a high-performance computing system. The system was equipped with an AMD EPYC 7713 64-Core Processor, 32 GB of RAM and one Nvidia RTX 2080 Ti GPU. |
| Software Dependencies | No | No specific ancillary software details with version numbers are provided. It mentions using 'Adam' for optimization and references several public continual learning implementations and algorithms, but without versioning for the software itself. |
| Experiment Setup | Yes | We optimize parameters using Adam (Kingma & Ba, 2015) with a learning rate of 0.001 for 800 episodes for each environment. The maximum number of steps in each episode is set to 100 and the starting configuration (head direction and coordinates) are different. The environment is a 30×30 grid with unique, noise-added step function markings on the walls. The agent has a field of view (FOV) of 120 degrees (see Figure 1a). For each epoch, the network is trained on 200 trajectories. Each trajectory was limited to a maximum of 100 time steps... In methods utilizing a buffer, we set its capacity to 200. For the DER++ algorithm, we adhered to the optimal parameters recommended in the paper: α and β both set at 0.5. All tested methods employed Cross Entropy loss and a learning rate of 0.001. |
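The experiment-setup row above can be collected into a single configuration object, which makes the quoted hyperparameters easy to check at a glance. This is a minimal sketch: the values are taken from the paper's reported setup, but the class and field names (`SWMTrainConfig`, etc.) are illustrative, not from the authors' code.

```python
from dataclasses import dataclass

@dataclass
class SWMTrainConfig:
    """Hyperparameters quoted in the paper's setup; names are illustrative."""
    learning_rate: float = 0.001      # Adam (Kingma & Ba, 2015)
    episodes_per_env: int = 800       # training episodes per environment
    max_steps_per_episode: int = 100  # episode length cap
    grid_size: int = 30               # 30x30 grid environment
    fov_degrees: int = 120            # agent field of view
    trajectories_per_epoch: int = 200
    buffer_capacity: int = 200        # for buffer-based baselines
    der_alpha: float = 0.5            # DER++ coefficient alpha
    der_beta: float = 0.5             # DER++ coefficient beta

cfg = SWMTrainConfig()
# Upper bound on environment interactions per environment:
max_env_steps = cfg.episodes_per_env * cfg.max_steps_per_episode  # 80000
```

Bundling the settings this way also makes it obvious that all baseline methods share the same learning rate and loss, with only the buffer- and DER++-specific fields differing.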