Compositionality with Variation Reliably Emerges in Neural Networks
Authors: Henry Conklin, Kenny Smith
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce 4 measures of linguistic variation and show that early in training measures of variation correlate with generalization performance, but that this effect goes away over time as the languages that emerge become regular enough to generalize robustly. Like natural languages, emergent languages appear able to support a high degree of variation while retaining the generalizability we expect from compositionality. In an effort to decrease the variability of emergent languages we show how reducing a model's capacity results in greater regularity, in line with claims about factors shaping the emergence of regularity in human language. |
| Researcher Affiliation | Academia | Institute for Language, Cognition and Computation, School of Informatics; Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, The University of Edinburgh. {henry.conklin, kenny.smith}@ed.ac.uk |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and Data can be found at: github.com/hcoxec/variable_compositionality |
| Open Datasets | Yes | Data is divided into 4 splits: training 60%, validation 10%, i.i.d. testing 10%, and o.o.d. testing 20%. |
| Dataset Splits | Yes | Data is divided into 4 splits: training 60%, validation 10%, i.i.d. testing 10%, and o.o.d. testing 20%. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU or CPU models. |
| Software Dependencies | No | Models are implemented using pytorch (Paszke et al., 2019), and make use of portions of code from the EGG repository (Kharitonov et al., 2019). |
| Experiment Setup | Yes | Full hyperparameters for the experiments presented here can be found in appendix A.10. Recurrent Unit: GRU; Hidden Size: 250, 500, 800; Entropy Regularization Coefficient: sender 0.5, receiver 0.0; Batch Size: 5000; Learning Rate: 1e-3; Signal Length: 6; Character Inventory: 26; Training Epochs: 800; Embedding Size: 52; Optimizer: sender REINFORCE, receiver Adam. |
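
The Dataset Splits row quotes a 60% / 10% / 10% / 20% division into training, validation, i.i.d. test, and o.o.d. test sets. The sketch below is a minimal illustration of producing splits in those proportions; the `split_dataset` helper and its uniform random assignment are assumptions for illustration only, since the paper's o.o.d. split is presumably defined by held-out input combinations rather than by random sampling.

```python
import numpy as np

def split_dataset(examples, seed=0):
    """Hypothetical helper: split examples 60/10/10/20 into train,
    validation, i.i.d. test, and o.o.d. test sets.

    Uniform random assignment is a simplification; the paper's o.o.d.
    split is presumably built from held-out input combinations.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(examples))
    n = len(examples)
    cuts = [int(0.6 * n), int(0.7 * n), int(0.8 * n)]
    train, val, iid_test, ood_test = np.split(idx, cuts)
    return {
        "train": [examples[i] for i in train],
        "validation": [examples[i] for i in val],
        "iid_test": [examples[i] for i in iid_test],
        "ood_test": [examples[i] for i in ood_test],
    }
```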
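
The Experiment Setup row lists the agent architecture and training hyperparameters. Below is a minimal, non-authoritative PyTorch sketch of a GRU sender configured with those values; the input dimensionality `n_features`, the use of symbol 0 as a start token, and the `GRUSender` class name are assumptions, and the receiver, the REINFORCE loss (with the sender entropy bonus of 0.5), and the EGG-based training loop from the released code are omitted.

```python
import torch
import torch.nn as nn

# Values transcribed from the Experiment Setup row; hidden_size is one of
# the three sizes swept in the paper (250, 500, 800).
CONFIG = {
    "hidden_size": 500,
    "embedding_size": 52,
    "vocab_size": 26,          # character inventory
    "signal_length": 6,
    "batch_size": 5000,
    "learning_rate": 1e-3,
    "sender_entropy_coef": 0.5,
    "receiver_entropy_coef": 0.0,
    "epochs": 800,
}

class GRUSender(nn.Module):
    """Minimal GRU sender: encodes an input vector, then emits a fixed-length
    message one symbol at a time, returning the sampled symbols along with
    their log-probabilities and entropies (the quantities a REINFORCE-style
    update and the entropy regulariser would consume)."""

    def __init__(self, n_features, cfg=CONFIG):
        super().__init__()
        self.encoder = nn.Linear(n_features, cfg["hidden_size"])
        self.embedding = nn.Embedding(cfg["vocab_size"], cfg["embedding_size"])
        self.cell = nn.GRUCell(cfg["embedding_size"], cfg["hidden_size"])
        self.to_vocab = nn.Linear(cfg["hidden_size"], cfg["vocab_size"])
        self.signal_length = cfg["signal_length"]

    def forward(self, x):
        hidden = torch.tanh(self.encoder(x))
        # Assumption: symbol 0 serves as a start-of-message token.
        symbol = torch.zeros(x.size(0), dtype=torch.long, device=x.device)
        symbols, log_probs, entropies = [], [], []
        for _ in range(self.signal_length):
            hidden = self.cell(self.embedding(symbol), hidden)
            dist = torch.distributions.Categorical(logits=self.to_vocab(hidden))
            symbol = dist.sample()
            symbols.append(symbol)
            log_probs.append(dist.log_prob(symbol))
            entropies.append(dist.entropy())
        return (torch.stack(symbols, dim=1),
                torch.stack(log_probs, dim=1),
                torch.stack(entropies, dim=1))
```

Per the Optimizer entry, the sender's sampled symbols would be trained with a REINFORCE estimator while the receiver is trained with Adam at learning rate 1e-3; that wiring, and the receiver itself, are left to the released code at github.com/hcoxec/variable_compositionality.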