Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing
Authors: Xiaolong Zou, Xingxing Cao, Xiaojiao Yang, Bo Hong
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on language command parsing tasks, specifically using the SCAN dataset. Our findings include: 1) attractor dynamics can facilitate systematic generalization and efficient learning from limited data; 2) through visualization and reverse engineering, we unravel a potential dynamic mechanism for grid network representing syntactic structure. |
| Researcher Affiliation | Industry | Qiyuan Lab, Beijing, China. Correspondence to: Bo Hong <hongbo@qiyuanlab.com>. |
| Pseudocode | No | The paper describes the model architecture and objective function using mathematical equations and prose, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not provide a direct link to a code repository or explicitly state that the source code for the methodology is publicly available. |
| Open Datasets | Yes | Here, we employ the SCAN dataset for model evaluation (Lake, 2019). SCAN is a collection of simple language-driven navigation tasks designed for studying compositional learning. |
| Dataset Splits | Yes | The dataset encompasses 20,910 records, and depending on the split between training and testing data, various SCAN tasks can be created: 1) Simple task: The dataset is randomly divided into the training set and the testing set, with both sets sharing the same distribution. 2) Add primitive task: The training set includes only some basic forms. ... 3) Length task: The training set comprises shorter language commands, while the testing set includes longer commands. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general training settings like 'batch size'. |
| Software Dependencies | Yes | All models are trained using Pytorch 2.0 (Paszke et al., 2019). |
| Experiment Setup | Yes | When training on SCAN tasks, the batch size is set to B = 1. The neural activity constraint coefficients for structure and content representations are set to α1 = α2 = 0.2. ... In the pre-training phase, ... we set the batch size Bpath to 200 and the regularization parameter β to 1 × 10⁻⁴. The training involves 56,000 iterations using the Adam optimizer with an initial learning rate of 0.001. ... Subsequently, in the co-training phase, ... a duration of 10 training epochs. Here, we maintain the initial learning rate at 0.001. |
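
The Dataset Splits row above describes three ways of partitioning the 20,910 SCAN records. The following is a minimal sketch of how one might parse SCAN records and build the "Simple task" random split; the line format (`IN: <command> OUT: <actions>`) follows the public SCAN dataset, while the helper names and the 80/20 split fraction are illustrative assumptions, not values from the paper.

```python
import random

def parse_scan_line(line):
    """Split one SCAN record of the form 'IN: <command> OUT: <actions>'
    into a (command, actions) pair."""
    command, actions = line.split(" OUT: ")
    return command.replace("IN: ", "").strip(), actions.strip()

def simple_split(records, train_fraction=0.8, seed=0):
    """Random split where train and test share the same distribution
    (the 'Simple task'); the fraction and seed are assumptions."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Example usage with two toy records in SCAN's line format.
records = [
    parse_scan_line("IN: jump OUT: I_JUMP"),
    parse_scan_line("IN: walk twice OUT: I_WALK I_WALK"),
]
train_set, test_set = simple_split(records, train_fraction=0.5)
```

The "Add primitive" and "Length" tasks would instead partition records by which primitive commands appear and by command length, respectively, rather than at random.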
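
The Experiment Setup row quotes the concrete hyperparameters. Below is a minimal PyTorch sketch that wires those values into an Adam optimizer and a pre-training loop skeleton; the model class, inputs, and loss terms are placeholders (the paper's attractor/grid network and objective are not reproduced here), and only the numeric hyperparameters come from the quoted setup.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the Experiment Setup row.
PRETRAIN_BATCH = 200        # B_path, pre-training batch size
PRETRAIN_ITERS = 56_000     # pre-training iterations
BETA = 1e-4                 # regularization coefficient β
SCAN_BATCH = 1              # B, batch size during SCAN co-training
ALPHA1 = ALPHA2 = 0.2       # neural-activity constraint coefficients
LR = 1e-3                   # initial learning rate (both phases)
CO_TRAIN_EPOCHS = 10        # co-training epochs

class PlaceholderGridNetwork(nn.Module):
    """Stand-in module; not the paper's actual architecture."""
    def __init__(self, dim=128):
        super().__init__()
        self.layer = nn.Linear(dim, dim)

    def forward(self, x):
        return self.layer(x)

model = PlaceholderGridNetwork()
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

# Pre-training loop skeleton: a placeholder task loss plus a
# β-weighted activity regularizer (both losses are assumptions).
for step in range(PRETRAIN_ITERS):
    x = torch.randn(PRETRAIN_BATCH, 128)      # dummy input batch
    activity = model(x)
    task_loss = activity.pow(2).mean()        # placeholder objective
    reg_loss = BETA * activity.abs().mean()   # placeholder regularizer
    loss = task_loss + reg_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    break  # remove to run the full 56,000 iterations
```

A co-training loop over SCAN would reuse the same optimizer settings with batch size 1 and the α1 = α2 = 0.2 activity constraints for 10 epochs, per the quoted setup.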