Relational recurrent neural networks

Authors: Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We then apply the RMC to a suite of tasks that may profit from more explicit memory-memory interactions, and hence, a potentially increased capacity for relational reasoning across time: partially observed reinforcement learning tasks, program evaluation, and language modeling on the WikiText-103, Project Gutenberg, and GigaWord datasets. ... achieving state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets." |
| Researcher Affiliation | Collaboration | α DeepMind, London, United Kingdom; β CoMPLEX, Computer Science, University College London, London, United Kingdom |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it provides mathematical equations for the LSTM embedding (equations 3–9) and flow diagrams. (An illustrative, hedged sketch appears after this table.) |
| Open Source Code | Yes | "please see the url in the footnote for our TensorFlow implementation of this model in the Sonnet library², and for the exact formulation we used, including our choice for the gψ function (briefly, we found a row/memory-wise MLP with layer normalisation to work best)" ² https://github.com/deepmind/sonnet/blob/master/sonnet/python/modules/relational_memory.py |
| Open Datasets | Yes | "WikiText-103 satisfies this set of requirements as it consists of Wikipedia articles shuffled at the article level with roughly 100M training tokens, as do two stylistically different sources of text data: books from Project Gutenberg³ and news articles from GigaWord v5 [36]." ³ Project Gutenberg. (n.d.). Retrieved January 2, 2018, from www.gutenberg.org. [36] Robert Parker, David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. English Gigaword Fifth Edition, LDC2011T07, DVD. Philadelphia: Linguistic Data Consortium, 2011. |
| Dataset Splits | No | The paper reports training token counts (e.g., "roughly 100M training tokens" for WikiText-103) and validation/test perplexities, but it does not specify the exact split percentages or sample counts for the training, validation, and test sets needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory amounts used for the experiments; it mentions only the software framework used (TensorFlow). |
| Software Dependencies | No | The paper mentions a "TensorFlow implementation of this model in the Sonnet library" but does not provide version numbers for TensorFlow, Sonnet, or any other software dependency. |
| Experiment Setup | Yes | "Here we briefly outline the tasks on which we applied the RMC, and direct the reader to the appendix for full details on each task and details on hyperparameter settings for the model. In the appendix we list the exact configurations for each task." |
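
The Pseudocode and Open Source Code rows note that the paper describes its model through equations, diagrams, and a brief characterisation of gψ as a row/memory-wise MLP with layer normalisation, rather than through pseudocode. The sketch below is a hedged, single-head NumPy reconstruction of one simplified memory-update step in that spirit: memory rows attend over the memory concatenated with the new input, followed by residuals, layer normalisation, and a row-wise MLP. All names, shapes, initialisations, and the single-head/ungated simplifications here are assumptions for illustration, not the authors' exact formulation; the Sonnet source linked above is authoritative.

```python
# Illustrative sketch only: a simplified, single-head version of an
# RMC-style memory update. The paper's multi-head attention and gated
# (LSTM-like) update are deliberately omitted; all shapes and names
# are assumptions made for this example.
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalise each row to zero mean / unit variance (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def attend(memory, memory_plus_input, w_q, w_k, w_v):
    """Dot-product attention: memory rows attend over memory plus the new input."""
    q = memory @ w_q                       # queries from the current memory
    k = memory_plus_input @ w_k            # keys over memory + input
    v = memory_plus_input @ w_v            # values over memory + input
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over rows
    return weights @ v

def rmc_step(memory, x, params):
    """One simplified memory-update step: attention, then a row-wise MLP (g_psi)."""
    m_plus_x = np.concatenate([memory, x[None, :]], axis=0)
    attended = attend(memory, m_plus_x,
                      params["w_q"], params["w_k"], params["w_v"])
    h = layer_norm(memory + attended)                # residual + layer norm
    # g_psi: the same small MLP applied independently to every memory row.
    mlp = np.tanh(h @ params["w1"]) @ params["w2"]
    return layer_norm(h + mlp)                       # second residual + layer norm

# Usage: 4 memory slots of width 16, one input vector per step.
rng = np.random.default_rng(0)
d = 16
params = {name: rng.standard_normal((d, d)) * 0.1
          for name in ["w_q", "w_k", "w_v", "w1", "w2"]}
memory = rng.standard_normal((4, d))
memory = rmc_step(memory, rng.standard_normal(d), params)
print(memory.shape)  # (4, 16)
```

Because gψ is applied row-wise, the same MLP weights process every memory slot independently, so the parameter count does not grow with the number of memory slots.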