Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing

Authors: Xiaolong Zou, Xingxing Cao, Xiaojiao Yang, Bo Hong

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model on language command parsing tasks, specifically using the SCAN dataset. Our findings include: 1) attractor dynamics can facilitate systematic generalization and efficient learning from limited data; 2) through visualization and reverse engineering, we unravel a potential dynamic mechanism for grid network representing syntactic structure.
Researcher Affiliation | Industry | Qiyuan Lab, Beijing, China. Correspondence to: Bo Hong <hongbo@qiyuanlab.com>.
Pseudocode | No | The paper describes the model architecture and objective function using mathematical equations and prose, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not provide a direct link to a code repository or explicitly state that the source code for the methodology is publicly available.
Open Datasets | Yes | Here, we employ the SCAN dataset for model evaluation (Lake, 2019). SCAN is a collection of simple language-driven navigation tasks designed for studying compositional learning.
Dataset Splits | Yes | The dataset encompasses 20,910 records, and depending on the split between training and testing data, various SCAN tasks can be created: 1) Simple task: The dataset is randomly divided into the training set and the testing set, with both sets sharing the same distribution. 2) Add primitive task: The training set includes only some basic forms. ... 3) Length task: The training set comprises shorter language commands, while the testing set includes longer commands. (A loading and split sketch follows the table.)
Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general training settings like 'batch size'.
Software Dependencies | Yes | All models are trained using PyTorch 2.0 (Paszke et al., 2019).
Experiment Setup | Yes | When training on SCAN tasks, the batch size is set to B = 1. The neural activity constraint coefficients for structure and content representations are set to α1 = α2 = 0.2. ... In the pre-training phase, ... we set the batch size B_path to 200 and the regularization parameter β to 1 × 10⁻⁴. The training involves 56,000 iterations using the Adam optimizer with an initial learning rate of 0.001. ... Subsequently, in the co-training phase, ... a duration of 10 training epochs. Here, we maintain the initial learning rate at 0.001. (A training-schedule sketch follows the table.)
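Because the paper relies on SCAN's published splits rather than releasing code, the sketch below shows one way to load SCAN-style data and rebuild two of the splits described in the Dataset Splits row. The file name, the 80/20 ratio for the simple split, and the helper names are illustrative assumptions; the 22-action training cutoff for the length split follows the public SCAN release, not a value stated in this paper.

```python
# Minimal sketch: parse SCAN-style lines ("IN: <command> OUT: <actions>")
# and rebuild a random "simple" split and a "length" split.
# File name, 80/20 ratio, and seed are illustrative assumptions.
import random

def load_scan(path):
    """Return a list of (command_tokens, action_tokens) pairs."""
    pairs = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            command, actions = line.split(" OUT: ", 1)
            command = command.replace("IN: ", "", 1)
            pairs.append((command.split(), actions.split()))
    return pairs

def simple_split(pairs, train_frac=0.8, seed=0):
    """Random split; train and test share the same distribution."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def length_split(pairs, max_train_actions=22):
    """Shorter action sequences for training, longer ones held out for test."""
    train = [p for p in pairs if len(p[1]) <= max_train_actions]
    test = [p for p in pairs if len(p[1]) > max_train_actions]
    return train, test

# Example usage (assumed file name):
# pairs = load_scan("tasks.txt")
# train, test = simple_split(pairs)
```

The add-primitive task would instead filter by command content (e.g., holding out all compositions of a primitive such as "jump"), which is why it is listed as a separate split in the row above.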
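Similarly, the Experiment Setup row reads as a two-phase schedule. The sketch below wires the quoted hyperparameters (α1 = α2 = 0.2, β = 1 × 10⁻⁴, batch sizes 200 and 1, 56,000 pre-training iterations, 10 co-training epochs, Adam at learning rate 0.001) into a generic PyTorch loop; the stub model and loss terms are placeholders, not the paper's grid network or objective.

```python
# Hedged sketch of the reported two-phase training schedule in PyTorch 2.0.
# StubGridNet and the loss terms are placeholders; only the hyperparameter
# values come from the paper's Experiment Setup description.
import torch
import torch.nn as nn

ALPHA1 = ALPHA2 = 0.2     # neural-activity constraint coefficients
BETA = 1e-4               # pre-training regularization parameter
LR = 1e-3                 # initial Adam learning rate
PRETRAIN_ITERS = 56_000   # pre-training iterations reported in the paper
B_PATH, B_SCAN = 200, 1   # pre-training / SCAN co-training batch sizes
CO_TRAIN_EPOCHS = 10      # co-training epochs

class StubGridNet(nn.Module):
    """Placeholder recurrent network standing in for the paper's grid module."""
    def __init__(self, n_in=2, n_hidden=128, n_out=2):
        super().__init__()
        self.rnn = nn.RNN(n_in, n_hidden, batch_first=True)
        self.readout = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        hidden, _ = self.rnn(x)
        return self.readout(hidden), hidden

model = StubGridNet()
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

# Phase 1: pre-training on path integration with an L2 penalty scaled by beta
# (the exact regularizer form is an assumption). The loop is truncated for
# brevity; the paper reports PRETRAIN_ITERS iterations at batch size B_PATH.
for step in range(10):
    velocities = torch.randn(B_PATH, 20, 2)        # dummy 2-D trajectories
    target_positions = velocities.cumsum(dim=1)
    predicted, hidden = model(velocities)
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    loss = nn.functional.mse_loss(predicted, target_positions) + BETA * l2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Phase 2: co-training on SCAN commands at batch size B_SCAN for
# CO_TRAIN_EPOCHS epochs, keeping the learning rate at LR and adding the
# ALPHA1/ALPHA2-weighted activity penalties on the structure and content
# representations to the parsing loss (omitted here; the objective is
# defined only in the paper).
```

Keeping the quoted hyperparameters as named constants separate from the stub model makes it straightforward to swap in the actual architecture and objective if code is later released.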