Semantics and Spatiality of Emergent Communication

Authors: Rotem Ben Zion, Boaz Carmeli, Orr Paradise, Yonatan Belinkov

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments with emergent communication games validate our theoretical results. These findings demonstrate an inherent advantage of distance-based communication goals and contextualize previous empirical discoveries. To support our findings, we run experiments on Shapes [27] and MNIST [30].
Researcher Affiliation | Academia | (1) Technion - Israel Institute of Technology; (2) UC Berkeley
Pseudocode | Yes | Algorithm 1: Compute message variance (a hedged reading is sketched after the table)
Open Source Code | Yes | Our code is available at https://github.com/Rotem-BZ/SemanticConsistency.
Open Datasets | Yes | To support our theoretical findings, we run experiments on two datasets: (i) the MNIST dataset [30] contains images of a single hand-written digit; (ii) the Shapes dataset [27] contains images of an object with random shape, color, and position.
Dataset Splits | Yes | We generate 10K data samples, split into 9,088 training samples (142 batches of 64) and 896 validation samples (14 batches of 64). The MNIST dataset [30] contains (1, 28, 28) images of hand-written digits, split into 54K training samples and 5,440 validation samples. (A loading/splitting sketch follows the table.)
Hardware Specification | Yes | The main training stage (EC) takes 4 hours on Shapes and 1.5 hours on MNIST using an NVIDIA GeForce RTX 2080 Ti.
Software Dependencies | No | The paper mentions Gumbel-Softmax, GRU, MLP, and transposed convolutions as components, but does not specify version numbers for these or for other software libraries (e.g., PyTorch, TensorFlow) or Python.
Experiment Setup | Yes | We train (during the EC phase) for 200/100 epochs with batch size 64. We use vocabulary size 10 and message length 4. Every message is full-length, as we do not use a special end-of-sentence token. We use a fixed temperature of 1.0 for the Gumbel-Softmax sampling method. (A sampling sketch follows the table.)
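
The paper's Algorithm 1 (compute message variance) is not reproduced on this page, so the following is a minimal sketch of one plausible reading, in which variance is measured as within-group disagreement from the modal message. The names `sender` and `groups` are hypothetical stand-ins, not the authors' API, and the disagreement metric is an assumption.

import torch

def message_variance(sender, groups):
    """Hedged sketch: average within-group message variability.

    `sender` and `groups` are hypothetical, not the paper's interface:
    sender(x) -> LongTensor of token ids, shape (batch, message_length);
    `groups` is an iterable of input batches, each batch holding inputs
    treated as semantically equivalent.
    """
    per_group = []
    for inputs in groups:
        msgs = sender(inputs)                         # (n, L) token ids
        mode = msgs.mode(dim=0).values                # modal token per position
        disagreement = (msgs != mode).float().mean()  # fraction of off-mode tokens
        per_group.append(disagreement)
    return torch.stack(per_group).mean()

# Toy check: a constant sender yields zero variance.
const_sender = lambda x: torch.zeros(len(x), 4, dtype=torch.long)
groups = [torch.randn(8, 3, 32, 32) for _ in range(5)]
assert message_variance(const_sender, groups).item() == 0.0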
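As a companion to the reported splits, here is a minimal PyTorch sketch of how such a split could be set up for MNIST. The exact train size is an assumption, chosen so that the reported 5,440 validation samples and roughly 54K training samples sum to MNIST's 60,000 training images; the Shapes data is generated by the authors' repository and is not shown here.

import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# MNIST via torchvision; images come out as (1, 28, 28) tensors.
mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())

# Assumed sizes: 54,560 + 5,440 = 60,000 (5,440 = 85 batches of 64).
train_set, val_set = random_split(mnist, [54_560, 5_440])

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64)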
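The reported setup (vocabulary size 10, message length 4, Gumbel-Softmax at temperature 1.0) can be illustrated with a short PyTorch sketch. The speaker network producing the logits is omitted (the paper uses a GRU), and the straight-through variant (`hard=True`) is an assumption, as the excerpt does not specify it.

import torch
import torch.nn.functional as F

VOCAB_SIZE, MSG_LEN, TAU = 10, 4, 1.0  # values reported in the paper

def sample_message(logits):
    """Draw one discrete message with straight-through Gumbel-Softmax.

    `logits` has shape (batch, MSG_LEN, VOCAB_SIZE). With hard=True,
    the forward pass returns one-hot samples while gradients flow
    through the soft relaxation.
    """
    return F.gumbel_softmax(logits, tau=TAU, hard=True, dim=-1)

# Toy usage: random logits stand in for a trained speaker network.
logits = torch.randn(2, MSG_LEN, VOCAB_SIZE)
message = sample_message(logits)   # (2, 4, 10) one-hot tokens
tokens = message.argmax(dim=-1)    # (2, 4) token ids; always full length,
                                   # since no end-of-sentence token is used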