Emergent Symbols through Binding in External Memory

Authors: Taylor Whittington Webb, Ishan Sinha, Jonathan Cohen

Venue: ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Across a series of tasks, we show that this architecture displays nearly perfect generalization of learned rules to novel entities given only a limited number of training examples, and outperforms a number of other competitive neural network architectures.
Researcher Affiliation | Academia | Taylor W. Webb, University of California, Los Angeles, Los Angeles, CA, taylor.w.webb@gmail.com; Ishan Sinha and Jonathan D. Cohen, Princeton University, Princeton, NJ
Pseudocode | Yes | Algorithm 1: Emergent Symbol Binding Network. (A hedged sketch of the binding mechanism is given after this table.)
Open Source Code | Yes | All code, including code for dataset generation, model implementation, training, and evaluation, is available on GitHub.
Open Datasets | No | For all tasks, we employ the same set of n = 100 images, in which each image is a distinct Unicode character (the specific characters used are shown in A.7). ... All code, including code for dataset generation, model implementation, training, and evaluation, is available on GitHub. - While code for dataset generation is provided, the paper does not provide a direct link to, or hosting for, the generated dataset itself, only the method to create it.
Dataset Splits | No | The paper provides details on training and test sets but does not explicitly mention or quantify validation dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions using the ADAM optimizer and refers to publicly available code for MNM, but does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions).
Experiment Setup | Yes | All models were trained with a batch size of 32 using the ADAM optimizer (Kingma & Ba, 2014). The learning rate for all models trained with TCN was 5e-4. ... Table 5: Learning rates for all models trained without TCN. ... Table 6: Default number of training epochs for all tasks and regimes. (A hedged training-loop sketch reflecting this configuration follows the table.)
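
The paper's Algorithm 1 describes the Emergent Symbol Binding Network (ESBN), in which a recurrent controller interacts with image embeddings only indirectly, through an external memory of key-value bindings. The sketch below illustrates that binding-and-retrieval step; it is not the authors' implementation, and the class name `ESBNStep`, the dimensions, and the confidence gating shown here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ESBNStep(nn.Module):
    """One step of a key-value binding memory, in the spirit of the ESBN (illustrative)."""

    def __init__(self, key_dim=256, hidden_dim=512):
        super().__init__()
        # The controller only ever sees retrieved keys (plus a confidence
        # scalar), never the image embeddings themselves.
        self.controller = nn.LSTMCell(key_dim + 1, hidden_dim)
        self.key_out = nn.Linear(hidden_dim, key_dim)
        self.key_dim = key_dim

    def forward(self, z_t, state, mem_keys, mem_vals):
        # z_t:      (batch, z_dim)       embedding of the current image
        # state:    (h, c) LSTM state, or None on the first step
        # mem_keys: (batch, t, key_dim)  keys written on previous steps
        # mem_vals: (batch, t, z_dim)    embeddings stored on previous steps
        batch = z_t.size(0)
        if mem_vals.size(1) == 0:
            # Empty memory on the first step: read a zero key with zero confidence.
            k_read = torch.zeros(batch, self.key_dim, device=z_t.device)
            conf = torch.zeros(batch, 1, device=z_t.device)
        else:
            # Compare the new embedding to stored embeddings; retrieve the
            # keys that were bound to the most similar ones.
            sim = torch.bmm(mem_vals, z_t.unsqueeze(2)).squeeze(2)     # (batch, t)
            w = F.softmax(sim, dim=1)
            k_read = torch.bmm(w.unsqueeze(1), mem_keys).squeeze(1)    # (batch, key_dim)
            conf = torch.sigmoid(sim.max(dim=1, keepdim=True).values)  # crude confidence gate
        # The controller updates its state from the retrieved key only.
        h, c = self.controller(torch.cat([k_read, conf], dim=1), state)
        # Write a new (key, embedding) binding into memory.
        k_write = self.key_out(h).unsqueeze(1)
        mem_keys = torch.cat([mem_keys, k_write], dim=1)
        mem_vals = torch.cat([mem_vals, z_t.unsqueeze(1)], dim=1)
        return (h, c), mem_keys, mem_vals
```

Because the controller state depends only on retrieved keys and never on the embeddings themselves, the same learned rule can be applied to entirely novel images, which is the generalization behavior the paper reports.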
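
The reported experiment setup (batch size 32, ADAM optimizer, learning rate 5e-4 for models trained with TCN) corresponds to a conventional supervised training loop. The following is a minimal sketch under those reported hyperparameters; `model`, `train_set`, and `num_epochs` are placeholders, and the per-task learning rates and epoch counts are those listed in Tables 5 and 6 of the paper.

```python
import torch
from torch.utils.data import DataLoader

def train(model, train_set, num_epochs, lr=5e-4, batch_size=32, device="cpu"):
    # Reported setup: batch size 32, ADAM optimizer, lr 5e-4 for TCN models.
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.to(device)
    model.train()
    for epoch in range(num_epochs):
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
    return model
```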