Emergent Communication: Generalization and Overfitting in Lewis Games

Authors: Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Based on this decomposition, we empirically examine the evolution of these two losses during the learning process (Section 5). Unless specified, all our experiments are run on the reconstruction game defined in Section 2.1.
Researcher Affiliation | Collaboration | Mathieu Rita (INRIA, Paris) mathieu.rita@inria.fr; Corentin Tallec, Paul Michel, Jean-Bastien Grill (DeepMind) [corentint,paulmiche,jbgrill]@deepmind.com; Olivier Pietquin (Google Research, Brain Team) pietquin@google.com; Emmanuel Dupoux (EHESS, ENS-PSL, CNRS, INRIA, Meta AI Research) emmanuel.dupoux@gmail.com; Florian Strub (DeepMind) fstrub@deepmind.com
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation is based on the EGG toolkit [39] and the code is available at https://github.com/MathieuRita/Population.
Open Datasets | Yes | We thus train our agents on a discriminative game on top of the CelebA [57] and ImageNet [69, 19] datasets while applying previous protocol. Training, validation and test sets are randomly drawn from this pool of objects (uniformly and without overlap), and are respectively composed of 4000, 1000 and 1000 elements.
Dataset Splits | Yes | Training, validation and test sets are randomly drawn from this pool of objects (uniformly and without overlap), and are respectively composed of 4000, 1000 and 1000 elements. (A hedged split sketch appears below the table.)
Hardware Specification | Yes | Each experiment runs on a single V100-32G GPU.
Software Dependencies | No | Our models are implemented in PyTorch [64] and are optimized using Adam [42]. No specific version numbers for software are provided.
Experiment Setup | Yes | The agents are optimized using Adam [42] with a learning rate of 5·10⁻⁴, β1 = 0.9 and β2 = 0.999, and a batch size of 1024. For the speaker we use policy gradient [76], with a baseline computed as the average reward within the minibatch, and an entropy regularization of 0.01 added to the speaker's loss [82]. In all experiments, we select the best models by early stopping. (A hedged training-step sketch appears below the table.)
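
The Dataset Splits row reports 4000 training, 1000 validation, and 1000 test elements drawn uniformly and without overlap from a single pool. Below is a minimal sketch of such a split, assuming the objects are simply indexed; the pool size, seed, and variable names are illustrative and do not come from the released code.

```python
import torch

# Pool of 4000 + 1000 + 1000 = 6000 objects, as reported in the paper.
POOL_SIZE = 6000
generator = torch.Generator().manual_seed(0)  # seed is illustrative

# A random permutation sliced into three parts gives uniform,
# non-overlapping train / validation / test index sets.
perm = torch.randperm(POOL_SIZE, generator=generator)
train_idx = perm[:4000]
valid_idx = perm[4000:5000]
test_idx = perm[5000:6000]

# Indices in a permutation are unique, so the three splits are disjoint.
assert len(set(perm.tolist())) == POOL_SIZE
```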
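
The Experiment Setup row lists Adam with a 5·10⁻⁴ learning rate, β = (0.9, 0.999), a batch size of 1024, policy gradient for the speaker with the batch-average reward as baseline, and a 0.01 entropy bonus. The sketch below shows what one such update could look like in PyTorch; the speaker/listener interfaces (returning messages, log-probabilities, entropies, and a per-example reconstruction loss) are assumptions for illustration, not the released EGG-based implementation.

```python
import torch


def reinforce_step(speaker, listener, objects, optimizer, entropy_coef=0.01):
    """One hypothetical update using the hyper-parameters reported in the paper.

    Assumed interfaces (placeholders): speaker(objects) returns
    (message, log_prob, entropy) for a sampled discrete message, and
    listener(message, objects) returns a per-example reconstruction loss.
    """
    message, log_prob, entropy = speaker(objects)
    reconstruction_loss = listener(message, objects)    # differentiable, shape (batch,)
    reward = -reconstruction_loss.detach()              # reward = negative loss

    baseline = reward.mean()                            # average reward in the minibatch
    speaker_loss = -((reward - baseline) * log_prob).mean()      # policy gradient (REINFORCE)
    speaker_loss = speaker_loss - entropy_coef * entropy.mean()  # entropy regularization

    listener_loss = reconstruction_loss.mean()          # listener trained by backprop

    optimizer.zero_grad()
    (speaker_loss + listener_loss).backward()
    optimizer.step()
    return listener_loss.item()


# Reported optimizer settings: Adam, lr = 5e-4, betas = (0.9, 0.999), batch size 1024.
# optimizer = torch.optim.Adam(parameters, lr=5e-4, betas=(0.9, 0.999))
```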