Compositionality with Variation Reliably Emerges in Neural Networks

Authors: Henry Conklin, Kenny Smith

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We introduce 4 measures of linguistic variation and show that early in training measures of variation correlate with generalization performance, but that this effect goes away over time as the languages that emerge become regular enough to generalize robustly. Like natural languages, emergent languages appear able to support a high degree of variation while retaining the generalizability we expect from compositionality. In an effort to decrease the variability of emergent languages we show how reducing a model's capacity results in greater regularity, in line with claims about factors shaping the emergence of regularity in human language.
Researcher Affiliation | Academia | Institute for Language, Cognition and Computation, School of Informatics; Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, The University of Edinburgh. {henry.conklin, kenny.smith}@ed.ac.uk
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data can be found at: github.com/hcoxec/variable_compositionality
Open Datasets | Yes | Data is divided into 4 splits: training 60%, validation 10%, i.i.d. testing 10%, and o.o.d. testing 20%.
Dataset Splits | Yes | Data is divided into 4 splits: training 60%, validation 10%, i.i.d. testing 10%, and o.o.d. testing 20%.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU or CPU models.
Software Dependencies | No | Models are implemented using pytorch (Paszke et al., 2019), and make use of portions of code from the EGG repository (Kharitonov et al., 2019).
Experiment Setup | Yes | Full hyperparameters for the experiments presented here can be found in appendix A.10. Recurrent Unit: GRU; Hidden Size: 250, 500, 800; Entropy Regularization Coefficient: sender 0.5, receiver 0.0; Batch Size: 5000; Learning Rate: 1e-3; Signal Length: 6; Character Inventory: 26; Training Epochs: 800; Embedding Size: 52; Optimizer: sender REINFORCE, receiver Adam.
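
For reference, the hyperparameters quoted in the Experiment Setup row above can be collected into a single configuration object. This is only a transcription of the reported values into Python; the key names are illustrative assumptions and are not taken from the authors' repository.

```python
# Hyperparameters transcribed from the quoted experiment setup.
# Key names are illustrative, not the authors' identifiers.
CONFIG = {
    "recurrent_unit": "GRU",
    "hidden_sizes": [250, 500, 800],        # the three model capacities compared
    "entropy_coeff": {"sender": 0.5, "receiver": 0.0},
    "batch_size": 5000,
    "learning_rate": 1e-3,
    "signal_length": 6,
    "character_inventory": 26,
    "training_epochs": 800,
    "embedding_size": 52,
    "optimizer": {"sender": "REINFORCE", "receiver": "Adam"},
}
```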
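The dataset split rows above imply a 60/10/10/20 partition of the data. A minimal sketch of such a partition is given below; the function name, the random seed, and in particular the way the o.o.d. portion is selected are assumptions for illustration, since the paper's o.o.d. split is presumably defined by held-out attribute combinations rather than random sampling.

```python
import random

def split_data(samples, seed=0):
    """Partition samples into train / validation / i.i.d. test / o.o.d. test.

    Proportions follow the 60/10/10/20 split reported in the paper. Selecting
    the o.o.d. portion by random shuffling is a simplifying assumption here
    and may differ from the authors' actual procedure.
    """
    rng = random.Random(seed)
    samples = samples[:]
    rng.shuffle(samples)

    n = len(samples)
    n_train, n_val, n_iid = int(0.6 * n), int(0.1 * n), int(0.1 * n)

    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    iid_test = samples[n_train + n_val:n_train + n_val + n_iid]
    ood_test = samples[n_train + n_val + n_iid:]  # remaining ~20%
    return train, val, iid_test, ood_test
```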