Discrete-Valued Neural Communication

Authors: Dianbo Liu, Alex M. Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael C. Mozer, Yoshua Bengio

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that discrete-valued neural communication (DVNC) substantially improves systematic generalization in a variety of architectures: transformers, modular architectures, and graph neural networks. |
| Researcher Affiliation | Collaboration | Dianbo Liu (Mila), Alex Lamb (Mila), Kenji Kawaguchi (Harvard University), Anirudh Goyal (Mila), Chen Sun (Mila), Michael C. Mozer (Google Research, Brain Team), Yoshua Bengio (Mila) |
| Pseudocode | Yes | Appendix E presents the pseudocode for RIMs with discretization. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We adapted and modified the original 2D shapes and 3D shapes movement tasks from Kipf et al. (2019)... We experimented with the Sort-of-CLEVR visual relational reasoning task... (Santoro et al., 2017)... we consider the task of classifying MNIST digits as sequences of pixels (Krueger et al., 2016). |
| Dataset Splits | No | The paper mentions 'training data', a 'test set', and 'OOD settings' (e.g., 'five objects are available in training data, three objects are available in OOD-1 and only two objects are available in OOD-2'), but it does not provide specific percentages or counts for training/validation/test splits, nor does it reference predefined splits that could be used for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details, such as exact GPU/CPU models or memory amounts, used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | We picked β = 0.25 as in the original VQ-VAE paper (Oord et al., 2017). We initialized e using k-means clustering on vectors h with k = L and trained the codebook together with other parts of the model by gradient descent. (A minimal sketch of this discretization setup follows the table.) |
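The discretization described in the Experiment Setup row is the VQ-VAE-style vector quantization that DVNC applies to communication vectors. Below is a minimal, hypothetical PyTorch sketch of such a layer: the class name `DiscreteComm`, the splitting of each vector into `num_segments` pieces quantized against one shared codebook, and all tensor shapes are illustrative assumptions, not the authors' code (the paper links no repository).

```python
# Hypothetical sketch of VQ-style discretization for DVNC; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiscreteComm(nn.Module):
    """Quantize communication vectors against a shared codebook (VQ-VAE style)."""

    def __init__(self, dim, num_codes, num_segments=1, beta=0.25):
        super().__init__()
        assert dim % num_segments == 0
        self.seg_dim = dim // num_segments
        self.beta = beta  # commitment weight; 0.25 as in the original VQ-VAE paper
        self.codebook = nn.Embedding(num_codes, self.seg_dim)  # the codebook e, with L entries

    def forward(self, h):
        # h: (batch, dim) pre-quantization vectors; split into segments of size seg_dim
        segs = h.reshape(-1, self.seg_dim)
        dists = torch.cdist(segs, self.codebook.weight)   # distances to all codes
        idx = dists.argmin(dim=-1)                         # nearest code per segment
        q = self.codebook(idx)                             # quantized segments
        # codebook loss pulls codes toward encoder outputs; the commitment loss
        # (scaled by beta) pulls encoder outputs toward their assigned codes
        loss = F.mse_loss(q, segs.detach()) + self.beta * F.mse_loss(segs, q.detach())
        # straight-through estimator: copy gradients through the non-differentiable argmin
        q = segs + (q - segs).detach()
        return q.reshape_as(h), loss, idx
```

The auxiliary `loss` would be added to the task loss, so the codebook is trained jointly with the rest of the model by gradient descent, as the setup quote describes.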
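The setup also initializes the codebook e by k-means clustering on the vectors h with k = L. A hedged sketch of that initialization, assuming scikit-learn's `KMeans` and the hypothetical `DiscreteComm` layer above:

```python
# Hypothetical k-means codebook initialization; the helper name and use of
# scikit-learn are assumptions, not details given in the paper.
import torch
from sklearn.cluster import KMeans


def init_codebook_with_kmeans(layer, h_samples):
    """Fit k = L cluster centers on pre-quantization vectors and copy them into the codebook."""
    segs = h_samples.reshape(-1, layer.seg_dim).detach().cpu().numpy()
    km = KMeans(n_clusters=layer.codebook.num_embeddings, n_init=10).fit(segs)
    with torch.no_grad():
        layer.codebook.weight.copy_(
            torch.as_tensor(km.cluster_centers_, dtype=layer.codebook.weight.dtype)
        )
```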