Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing
Authors: Xiaolong Zou, Xingxing Cao, Xiaojiao Yang, Bo Hong
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on language command parsing tasks, specifically using the SCAN dataset. Our findings include: 1) attractor dynamics can facilitate systematic generalization and efficient learning from limited data; 2) through visualization and reverse engineering, we unravel a potential dynamic mechanism for grid network representing syntactic structure. |
| Researcher Affiliation | Industry | Qiyuan Lab, Beijing, China. Correspondence to: Bo Hong <EMAIL>. |
| Pseudocode | No | The paper describes the model architecture and objective function using mathematical equations and prose, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not provide a direct link to a code repository or explicitly state that the source code for the methodology is publicly available. |
| Open Datasets | Yes | Here, we employ the SCAN dataset for model evaluation (Lake, 2019). SCAN is a collection of simple language-driven navigation tasks designed for studying compositional learning. |
| Dataset Splits | Yes | The dataset encompasses 20,910 records, and depending on the split between training and testing data, various SCAN tasks can be created: 1) Simple task: The dataset is randomly divided into the training set and the testing set, with both sets sharing the same distribution. 2) Add primitive task: The training set includes only some basic forms. ... 3) Length task: The training set comprises shorter language commands, while the testing set includes longer commands. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general training settings like 'batch size'. |
| Software Dependencies | Yes | All models are trained using PyTorch 2.0 (Paszke et al., 2019). |
| Experiment Setup | Yes | When training on SCAN tasks, the batch size is set to B = 1. The neural activity constraint coefficients for structure and content representations are set to α1 = α2 = 0.2. ... In the pre-training phase, ... we set the batch size B_path to 200 and the regularization parameter β to 1×10⁻⁴. The training involves 56,000 iterations using the Adam optimizer with an initial learning rate of 0.001. ... Subsequently, in the co-training phase, ... a duration of 10 training epochs. Here, we maintain the initial learning rate at 0.001. |
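The hyperparameters quoted above can be collected into a minimal PyTorch sketch. This is not the paper's implementation (no code was released); the model here is a placeholder `nn.Linear`, and only the optimizer choice and the numeric settings (Adam, lr = 0.001, B = 1, B_path = 200, α1 = α2 = 0.2, β = 1×10⁻⁴, 56,000 pre-training iterations, 10 co-training epochs) are taken from the quoted setup.

```python
import torch
from torch import nn

# Placeholder model: stands in for the grid/attractor network, whose
# architecture is described in the paper but not released as code.
model = nn.Linear(8, 8)

# Pre-training phase settings, as quoted in the Experiment Setup row.
pretrain_cfg = {
    "batch_size": 200,       # B_path
    "beta": 1e-4,            # regularization parameter β
    "iterations": 56_000,    # pre-training iterations
}

# SCAN co-training phase settings, as quoted.
cotrain_cfg = {
    "batch_size": 1,         # B
    "alpha1": 0.2,           # activity constraint, structure representation
    "alpha2": 0.2,           # activity constraint, content representation
    "epochs": 10,            # co-training duration
}

# Both phases use Adam with an initial learning rate of 0.001.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

How the α and β coefficients enter the loss, and any learning-rate schedule after initialization, are not specified in the quoted text, so they are omitted here.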