Sparse Communication via Mixed Distributions
Authors: António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment with both approaches on an emergent communication benchmark and on modeling MNIST and Fashion-MNIST data with variational auto-encoders with mixed latent variables. |
| Researcher Affiliation | Collaboration | António Farinhas 1, Wilker Aziz 2, Vlad Niculae 3, André F. T. Martins 1,4 1Instituto de Telecomunicações, Instituto Superior Técnico (Lisbon ELLIS Unit), 2ILLC, University of Amsterdam, 3IvI, University of Amsterdam, 4Unbabel |
| Pseudocode | No | The paper describes algorithms in text (e.g., 'forward algorithm', 'backward algorithm') but does not present them as structured pseudocode or labeled algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available. Additionally, code and instructions to reproduce our experiments are available at https://github.com/deep-spin/sparse-communication. |
| Open Datasets | Yes | Data. The dataset consists of a subset of ImageNet (Deng et al., 2009)...To get the dataset visit https://github.com/DianeBouchacourt/SignalingGame (Bouchacourt & Baroni, 2018). We use Fashion-MNIST (Xiao et al., 2017)...We use stochastically binarized MNIST (LeCun et al., 2010). |
| Dataset Splits | Yes | The first 55,000 instances are used for training, the next 5,000 instances for development and the remaining 10,000 for test. (A split sketch follows this table.) |
| Hardware Specification | Yes | Our infrastructure consists of 5 machines with the specifications shown in Table 5 (Computing infrastructure; GPUs, CPU, RAM per machine): 1) 4× Titan Xp 12GB, 16× AMD Ryzen 1950X @ 3.40GHz, 128GB; 2) 4× GTX 1080 Ti 12GB, 8× Intel i7-9800X @ 3.80GHz, 128GB; 3) 3× RTX 2080 Ti 12GB, 12× AMD Ryzen 2920X @ 3.50GHz, 128GB; 4) 3× RTX 2080 Ti 12GB, 12× AMD Ryzen 2920X @ 3.50GHz, 128GB; 5) 2× GTX Titan X 12GB, 12× Intel Xeon E5-1650 v3 @ 3.50GHz, 64GB. |
| Software Dependencies | Yes | This work was built on open-source software; we acknowledge Van Rossum & Drake (2009); Oliphant (2006); Virtanen et al. (2020); Walt et al. (2011); Pedregosa et al. (2011), and Paszke et al. (2019). |
| Experiment Setup | Yes | We choose the best hyperparameter configuration by doing a grid search on the learning rate (0.01, 0.005, 0.001)...the temperature is annealed using the schedule τ = max(0.5, exp(−rt))...For the K-D Hard Concrete we use a scaling constant λ = 1.1 and for Gaussian Sparsemax we set Σ = I. All models were trained for 500 epochs using the Adam optimizer with a batch size of 64. (A training-setup sketch follows this table.) |
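
For concreteness, here is a minimal sketch of the 55,000/5,000/10,000 MNIST split and the stochastic binarization quoted in the dataset rows above. It assumes `torchvision`; the paper's actual loading code lives in the deep-spin/sparse-communication repository and may differ.

```python
# Sketch of the 55k/5k/10k MNIST split described above (assumption: torchvision).
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Stochastic binarization (LeCun et al., 2010): treat grayscale intensities as
# Bernoulli probabilities and resample binary pixels on every access.
binarize = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(torch.bernoulli),
])

full_train = datasets.MNIST("data", train=True, download=True, transform=binarize)
test_set = datasets.MNIST("data", train=False, download=True, transform=binarize)

# First 55,000 instances for training, the next 5,000 for development,
# and the standard 10,000-instance test set.
train_set = Subset(full_train, range(55_000))
dev_set = Subset(full_train, range(55_000, 60_000))

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
```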
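
The experiment-setup row can likewise be read as a small training skeleton: a learning-rate grid for Adam and the annealed temperature τ = max(0.5, exp(−rt)). The model and the annealing rate `r` below are placeholders, not values reported in the excerpt.

```python
# Hedged sketch of the quoted setup: Adam, batch size 64, 500 epochs,
# temperature schedule tau = max(0.5, exp(-r*t)).
import math
import torch
from torch import nn

def tau_schedule(t: int, r: float = 1e-4) -> float:
    """tau = max(0.5, exp(-r*t)); the rate r is an assumption, not reported here."""
    return max(0.5, math.exp(-r * t))

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy stand-in model

for lr in (0.01, 0.005, 0.001):  # grid-searched learning rates from the excerpt
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    # ... train for 500 epochs with batch size 64, calling tau_schedule(step)
    #     at each update to anneal the relaxation temperature.
```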