Cortical microcircuits as gated-recurrent neural networks
Authors: Rui Costa, Ioannis Alexandros Assael, Brendan Shillingford, Nando de Freitas, Tim Vogels
NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation across sequential image classification and language modelling tasks shows that subLSTM units can achieve similar performance to LSTM units. |
| Researcher Affiliation | Collaboration | Rui Ponte Costa (Centre for Neural Circuits and Behaviour, Dept. of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK; rui.costa@cncb.ox.ac.uk); Yannis M. Assael (Dept. of Computer Science, University of Oxford, Oxford, UK, and DeepMind, London, UK; yannis.assael@cs.ox.ac.uk); Brendan Shillingford (Dept. of Computer Science, University of Oxford, Oxford, UK, and DeepMind, London, UK; brendan.shillingford@cs.ox.ac.uk); Nando de Freitas (DeepMind, London, UK; nandodefreitas@google.com); Tim P. Vogels (Centre for Neural Circuits and Behaviour, Dept. of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK; tim.vogels@cncb.ox.ac.uk) |
| Pseudocode | No | The paper presents mathematical equations for the LSTM and subLSTM models and their derivatives, along with diagrams, but it does not include a distinct section or block labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In the sequential MNIST digit classification task, each digit image from the MNIST dataset is presented to the RNN as a sequence of pixels (Le et al. (2015); Fig. 2a). We first used the Penn Treebank (PTB) dataset to train our model on word-level language modelling (929k training, 73k validation and 82k test words; with a vocabulary of 10k words). We also tested the Wikitext-2 language modelling dataset based on Wikipedia articles. |
| Dataset Splits | Yes | We first used the Penn Treebank (PTB) dataset to train our model on word-level language modelling (929k training, 73k validation and 82k test words; with a vocabulary of 10k words). This [Wikitext-2] dataset is twice as large as the PTB dataset (2000k training, 217k validation and 245k test words) and also features a larger vocabulary (33k words). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'RMSProp with momentum' and 'Google Vizier' for optimization, but it does not specify version numbers for any software dependencies, libraries, or programming languages used in the implementation of the models. |
| Experiment Setup | Yes | The network was optimised using RMSProp with momentum (Tieleman and Hinton, 2012), a learning rate of 10⁻⁴, one hidden layer and 100 hidden units. All RNNs tested have 2 hidden layers; backpropagation is truncated to 35 steps, and a batch size of 20 is used. To optimise the networks we used RMSProp with momentum. We also performed a hyperparameter search on the validation set over input, output, and update dropout rates, the learning rate, and weight decay. The hyperparameter search was done with Google Vizier, which performs black-box optimisation using Gaussian process bandits and transfer learning. Tables 2 and 3 show the resulting hyperparameters. |
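The subLSTM unit referenced in the Research Type row replaces the LSTM's multiplicative input and output gating with subtractive gating, with sigmoid activations throughout. The minimal NumPy sketch below illustrates that cell update; the weight packing, initialisation, and function names are our own assumptions, since the paper does not release code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sublstm_step(x, h_prev, c_prev, W, U, b):
    """One subLSTM step (sketch of the update described in Costa et al., 2017).

    The gates (i, f, o) and the candidate z all use sigmoid activations, and the
    input/output gates act subtractively rather than multiplicatively.
    W: (4*d, n_in), U: (4*d, d), b: (4*d,), packed as [z, i, f, o] --
    this packing is an implementation assumption, not taken from the paper.
    """
    pre = W @ x + U @ h_prev + b
    z, i, f, o = np.split(sigmoid(pre), 4)
    c = f * c_prev + z - i      # subtractive input gating of the cell state
    h = sigmoid(c) - o          # subtractive output gating of the hidden state
    return h, c

# Tiny usage example with random weights (hidden size d=4, input size n=3).
rng = np.random.default_rng(0)
d, n = 4, 3
W, U, b = rng.normal(size=(4 * d, n)), rng.normal(size=(4 * d, d)), np.zeros(4 * d)
h, c = sublstm_step(rng.normal(size=n), np.zeros(d), np.zeros(d), W, U, b)
```

In the paper's fix-subLSTM variant the forget gate is held at a fixed learned value rather than computed from the current input and hidden state.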
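The Experiment Setup row fully specifies the sequential-MNIST optimiser settings (RMSProp with momentum, learning rate 10⁻⁴, one hidden layer of 100 units), so a hedged PyTorch sketch of that configuration follows. A stock nn.LSTM stands in for the paper's subLSTM cell, and the momentum coefficient, readout layer, and dummy batch size are assumptions; the PTB/Wikitext-2 dropout rates and learning rates were tuned with Google Vizier and are not reproduced here.

```python
import torch
import torch.nn as nn

# Sketch of the sequential-MNIST setup: one recurrent layer of 100 units,
# trained with RMSProp plus momentum at a learning rate of 1e-4. A stock LSTM
# is used as a stand-in because the paper's subLSTM is not a built-in module.
rnn = nn.LSTM(input_size=1, hidden_size=100, num_layers=1, batch_first=True)
readout = nn.Linear(100, 10)  # 10 digit classes (readout layer is assumed)

optimizer = torch.optim.RMSprop(
    list(rnn.parameters()) + list(readout.parameters()),
    lr=1e-4,       # learning rate stated in the paper
    momentum=0.9,  # assumption: the paper says "with momentum" but gives no value
)

# Each 28x28 MNIST digit is unrolled into a length-784 pixel sequence.
x = torch.randn(8, 784, 1)          # dummy batch: (batch, time, features)
out, _ = rnn(x)
logits = readout(out[:, -1, :])     # classify from the final hidden state
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (8,)))
loss.backward()
optimizer.step()
```

For the language-modelling runs the table instead reports 2 hidden layers, truncated backpropagation over 35 steps, and a batch size of 20, with dropout rates and weight decay tuned per dataset (Tables 2 and 3 of the paper).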