Semantics and Spatiality of Emergent Communication
Authors: Rotem Ben Zion, Boaz Carmeli, Orr Paradise, Yonatan Belinkov
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with emergent communication games validate our theoretical results. These findings demonstrate an inherent advantage of distance-based communication goals, and contextualize previous empirical discoveries. To support our findings, we run experiments on Shapes [27] and MNIST [30]. |
| Researcher Affiliation | Academia | 1 Technion Israel Institute of Technology 2 UC Berkeley |
| Pseudocode | Yes | Algorithm 1 Compute message variance |
| Open Source Code | Yes | Our code is available at https://github.com/Rotem-BZ/SemanticConsistency. |
| Open Datasets | Yes | To support our theoretical findings, we run experiments on two datasets: (i) the MNIST dataset [30] contains images of a single hand-written digit; (ii) the Shapes dataset [27] contains images of an object with random shape, color and position. |
| Dataset Splits | Yes | We generate 10K data samples, split into 9088 (142 batches of 64) training samples and 896 (14 batches of 64) validation samples. The MNIST dataset [30] contains (1, 28, 28) images of hand-written digits, split into 54K training samples and 5440 validation samples. |
| Hardware Specification | Yes | The main training stage (EC) takes 4 hours on Shapes, 1.5 hours on MNIST using NVIDIA GeForce RTX 2080 Ti. |
| Software Dependencies | No | The paper mentions Gumbel-Softmax, GRU, MLP, and transposed convolutions as components, but does not specify version numbers for these or other software libraries (e.g., PyTorch, TensorFlow) or Python. |
| Experiment Setup | Yes | We train (during the EC phase) for 200/100 epochs with batch size 64. We use vocabulary size 10 and message length 4. Every message is full length as we do not use a special end-of-sentence token. We use a fixed temperature value of 1.0 for the Gumbel-Softmax sampling method. |
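The split sizes quoted above are exact multiples of the batch size 64, which is worth checking: 142 + 14 full batches account for 9,984 of the 10K generated Shapes samples, so (presumably, this is an inference, not stated in the paper) the 16-sample remainder is dropped to keep every batch full. A quick arithmetic check:

```python
BATCH = 64
TRAIN_BATCHES, VAL_BATCHES = 142, 14

train_samples = TRAIN_BATCHES * BATCH  # 9088, matches the reported split
val_samples = VAL_BATCHES * BATCH      # 896, matches the reported split

total_used = train_samples + val_samples
leftover = 10_000 - total_used  # 16 samples not covered by any full batch

print(train_samples, val_samples, leftover)
```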
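The setup row describes messages of length 4 over a vocabulary of 10 symbols, each sampled with Gumbel-Softmax at a fixed temperature of 1.0. A minimal NumPy sketch of that sampling step follows; the zero logits are placeholders (in the actual code they would come from the GRU sender), and this illustrates only the relaxation itself, not the authors' full pipeline:

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    # Gumbel(0, 1) noise via inverse transform: -log(-log(U)), U ~ Uniform(0, 1)
    gumbel = -np.log(-np.log(rng.uniform(low=1e-12, high=1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    # Numerically stable softmax over the vocabulary axis
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    return y / y.sum(axis=-1, keepdims=True)

VOCAB_SIZE, MSG_LEN, TAU = 10, 4, 1.0  # values from the table
rng = np.random.default_rng(0)
logits = np.zeros((MSG_LEN, VOCAB_SIZE))  # placeholder sender logits
message = gumbel_softmax(logits, tau=TAU, rng=rng)  # one row per symbol slot
```

Each of the 4 rows of `message` is a point on the 10-dimensional simplex; at low temperatures the rows approach one-hot vectors, recovering discrete symbols.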