Emergent Symbols through Binding in External Memory
Authors: Taylor Whittington Webb, Ishan Sinha, Jonathan Cohen
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across a series of tasks, we show that this architecture displays nearly perfect generalization of learned rules to novel entities given only a limited number of training examples, and outperforms a number of other competitive neural network architectures. |
| Researcher Affiliation | Academia | Taylor W. Webb, University of California Los Angeles, Los Angeles, CA (taylor.w.webb@gmail.com); Ishan Sinha and Jonathan D. Cohen, Princeton University, Princeton, NJ |
| Pseudocode | Yes | Algorithm 1: Emergent Symbol Binding Network. (A simplified sketch of this binding memory appears after the table.) |
| Open Source Code | Yes | All code, including code for dataset generation, model implementation, training, and evaluation, is available on GitHub. |
| Open Datasets | No | For all tasks, we employ the same set of n = 100 images, in which each image is a distinct Unicode character (the specific characters used are shown in A.7). ... All code, including code for dataset generation, model implementation, training, and evaluation, is available on GitHub. - While code for dataset generation is provided, the paper does not provide a direct link or hosting for the generated dataset itself, only the method to create it. (An illustrative generation snippet appears after the table.) |
| Dataset Splits | No | The paper provides details on training and test sets but does not explicitly mention or quantify validation dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions using the ADAM optimizer and refers to publicly available code for MNM, but does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | All models were trained with a batch size of 32 using the ADAM optimizer (Kingma & Ba, 2014). The learning rate for all models trained with TCN was 5e-4. ... Table 5: Learning rates for all models trained without TCN. ... Table 6: Default number of training epochs for all tasks and regimes. (A configuration sketch appears after the table.) |
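
The Pseudocode row points to Algorithm 1, the Emergent Symbol Binding Network. As a reading aid only, the following is a minimal PyTorch sketch of the core idea: image embeddings ("values") are bound to learned keys in an external memory, and the controller reasons over retrieved keys rather than raw perceptual content. Class and variable names, dimensions, and the omission of gating, confidence terms, and output heads are all simplifications of ours, not the authors' implementation (which is in their released code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BindingMemorySketch(nn.Module):
    """Toy key-value binding memory, loosely after ESBN's Algorithm 1.

    Each step binds a learned key to the current image embedding (the
    "value").  Retrieval compares the current value with stored values and
    returns a similarity-weighted sum of the stored keys, so the controller
    only ever sees keys, never raw perceptual content.  Gating, confidence
    terms, and the output head from the paper are omitted here.
    """

    def __init__(self, key_dim=64):
        super().__init__()
        self.controller = nn.LSTMCell(key_dim, key_dim)
        self.key_out = nn.Linear(key_dim, key_dim)

    def forward(self, values):
        # values: (seq_len, batch, value_dim) pre-computed image embeddings
        seq_len, batch, _ = values.shape
        h = values.new_zeros(batch, self.controller.hidden_size)
        c = torch.zeros_like(h)
        retrieved = torch.zeros_like(h)
        mem_keys, mem_values = [], []
        for t in range(seq_len):
            # the controller's input is the key retrieved at the previous step
            h, c = self.controller(retrieved, (h, c))
            k_t = self.key_out(h)
            if mem_keys:
                K = torch.stack(mem_keys, dim=1)    # (batch, t, key_dim)
                V = torch.stack(mem_values, dim=1)  # (batch, t, value_dim)
                # value-to-value similarity -> weights over the stored keys
                w = F.softmax((V * values[t].unsqueeze(1)).sum(-1), dim=1)
                retrieved = (w.unsqueeze(-1) * K).sum(dim=1)
            # bind the new (key, value) pair into memory
            mem_keys.append(k_t)
            mem_values.append(values[t])
        return h

# e.g.: BindingMemorySketch()(torch.randn(9, 32, 128))  # 9-step sequence, batch of 32
```

Keeping the controller on the key side of the memory is, per the paper's framing, what encourages abstract rules to be learned separately from the entities they apply to, which is the mechanism behind the reported generalization to novel entities.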
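The Open Datasets row notes that the dataset is generated from 100 images of distinct Unicode characters rather than hosted directly. The snippet below is a hypothetical illustration of such generation using Pillow; the font path, image size, and character range are assumptions, since the exact rendering is defined in the authors' released dataset-generation code.

```python
from PIL import Image, ImageDraw, ImageFont

def render_glyph(char, size=32, font_path="DejaVuSans.ttf"):
    """Render one Unicode character as a grayscale image (illustrative only).

    The font path is an assumption; any TrueType font covering the chosen
    characters will do.  The `anchor="mm"` centering requires Pillow >= 8.0.
    """
    img = Image.new("L", (size, size), color=0)
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, int(size * 0.8))
    draw.text((size // 2, size // 2), char, fill=255, font=font, anchor="mm")
    return img

# e.g. build a small pool of distinct glyph images (character range is arbitrary here)
glyphs = [render_glyph(chr(code)) for code in range(0x0041, 0x0051)]
```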
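The Experiment Setup row reports a batch size of 32, the ADAM optimizer, and a learning rate of 5e-4 for models trained with TCN. The sketch below shows how that configuration might look in PyTorch; the model, data, and epoch count are stand-in placeholders, and the per-task learning rates and epoch counts are given in Tables 5 and 6 of the paper.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model and data; the real task models and datasets come from the
# authors' released code.  Hyperparameters follow the reported setup:
# batch size 32, ADAM, learning rate 5e-4 (the value used with TCN).
model = nn.Linear(128, 4)                       # placeholder for the task model
dataset = TensorDataset(torch.randn(256, 128),  # placeholder embeddings
                        torch.randint(0, 4, (256,)))
loader = DataLoader(dataset, batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)

for epoch in range(2):                          # per-task epoch counts are in Table 6
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
```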