Relational recurrent neural networks
Authors: Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then apply the RMC to a suite of tasks that may profit from more explicit memory-memory interactions, and hence, a potentially increased capacity for relational reasoning across time: partially observed reinforcement learning tasks, program evaluation, and language modeling on the WikiText-103, Project Gutenberg, and GigaWord datasets. ... achieving state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets. |
| Researcher Affiliation | Collaboration | α DeepMind, London, United Kingdom; β CoMPLEX, Computer Science, University College London, London, United Kingdom |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It provides mathematical equations (3-9) for the LSTM embedding, along with flow diagrams. |
| Open Source Code | Yes | please see the url in the footnote for our Tensorflow implementation of this model in the Sonnet library 2, and for the exact formulation we used, including our choice for the gψ function (briefly, we found a row/memory-wise MLP with layer normalisation to work best). 2https://github.com/deepmind/sonnet/blob/master/sonnet/python/modules/relational_memory.py |
| Open Datasets | Yes | WikiText-103 satisfies this set of requirements as it consists of Wikipedia articles shuffled at the article level with roughly 100M training tokens, as do two stylistically different sources of text data: books from Project Gutenberg3 and news articles from GigaWord v5 [36]. 3Project Gutenberg. (n.d.). Retrieved January 2, 2018, from www.gutenberg.org [36] Robert Parker, David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. English Gigaword Fifth Edition LDC2011T07. DVD. Philadelphia: Linguistic Data Consortium, 2011. |
| Dataset Splits | No | The paper mentions training token counts for datasets (e.g., 'roughly 100M training tokens' for WikiText-103) and reports validation/test perplexities, but it does not specify the exact split percentages or sample counts for training, validation, and test sets needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only mentions the software framework used (TensorFlow). |
| Software Dependencies | No | The paper mentions using 'Tensorflow implementation of this model in the Sonnet library' but does not provide specific version numbers for TensorFlow, Sonnet, or any other software dependencies. |
| Experiment Setup | Yes | Here we briefly outline the tasks on which we applied the RMC, and direct the reader to the appendix for full details on each task and details on hyperparameter settings for the model. In the appendix we list the exact configurations for each task. |
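For orientation alongside the Open Source Code row above, the following is a minimal NumPy sketch of a single relational memory core update, based only on the quoted description (multi-head attention over the memory concatenated with the input, followed by a row/memory-wise MLP with layer normalisation for gψ). It is illustrative, not the reference Sonnet implementation in `relational_memory.py`; all parameter names (`W_q`, `W_k`, `W_v`, `W_mlp1`, `W_mlp2`) are hypothetical, and the LSTM-style input/forget gating used in the actual RMC is omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalise each memory row to zero mean / unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def relational_memory_step(memory, inputs, params, num_heads=4):
    """One relational memory update (illustrative sketch, not the paper's exact code).

    memory: [mem_slots, mem_size]  current memory matrix
    inputs: [n_inputs, mem_size]   new inputs projected to memory width
    params: dict of projection matrices (hypothetical names for this sketch)
    """
    mem_slots, mem_size = memory.shape
    head_size = mem_size // num_heads

    # Memory attends over itself concatenated with the new inputs, so slots
    # can interact with each other and incorporate incoming information.
    mem_plus_input = np.concatenate([memory, inputs], axis=0)
    q = memory @ params["W_q"]           # queries from memory only
    k = mem_plus_input @ params["W_k"]   # keys/values from memory + input
    v = mem_plus_input @ params["W_v"]

    # Split into heads: [num_heads, rows, head_size].
    def split(x):
        return x.reshape(x.shape[0], num_heads, head_size).transpose(1, 0, 2)
    qh, kh, vh = split(q), split(k), split(v)

    # Scaled dot-product attention per head, then merge heads back.
    attn = softmax(qh @ kh.transpose(0, 2, 1) / np.sqrt(head_size))
    attended = (attn @ vh).transpose(1, 0, 2).reshape(mem_slots, mem_size)

    # Residual connection plus a row-wise MLP with layer normalisation,
    # mirroring the reported choice for the g_psi function.
    memory = layer_norm(memory + attended)
    mlp = np.maximum(memory @ params["W_mlp1"], 0.0) @ params["W_mlp2"]
    return layer_norm(memory + mlp)
```

As a usage note, `params` would hold trained weights of shapes `[mem_size, mem_size]` for the attention projections and `[mem_size, hidden]` / `[hidden, mem_size]` for the MLP; the reference implementation linked in the table wraps this update in a recurrent core with gating.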