Strongly-Typed Recurrent Neural Networks
Authors: David Balduzzi, Muhammad Ghifary
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in section 4 show that, despite being more constrained, strongly-typed architectures achieve lower training and comparable generalization error to classical architectures. |
| Researcher Affiliation | Collaboration | ¹Victoria University of Wellington, New Zealand; ²Weta Digital, New Zealand |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We used Leo Tolstoy's War and Peace (WP), which consists of 3,258,246 characters of English text, split into train/val/test sets with 80/10/10 ratios. We used the Penn Treebank (PTB) dataset (Marcus et al., 1993), which consists of 929K training words, 73K validation words, and 82K test words, with a vocabulary size of 10K words. The PTB dataset is publicly available on the web: http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz |
| Dataset Splits | Yes | We used Leo Tolstoy's War and Peace (WP), which consists of 3,258,246 characters of English text, split into train/val/test sets with 80/10/10 ratios. We used the Penn Treebank (PTB) dataset (Marcus et al., 1993), which consists of 929K training words, 73K validation words, and 82K test words. (A split sketch follows the table.) |
| Hardware Specification | Yes | Training on the PTB dataset on an NVIDIA GTX 980 GPU |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific deep learning frameworks like TensorFlow or PyTorch, or programming language versions like Python 3.x). |
| Experiment Setup | Yes | Results are reported for two configurations, "64" and "256", which are models with the same number of parameters as a 1-layer LSTM with 64 and 256 cells per layer, respectively. Dropout regularization was only applied to the "256" models; the dropout rate was taken from {0.1, 0.2} based on validation performance. For the medium models, the dropout rate was selected from {0.4, 0.5, 0.6} according to validation performance. (A dropout-selection sketch follows the table.) |
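
The 80/10/10 character-level split quoted above for War and Peace could be reproduced roughly as follows. This is a minimal sketch: the file name, encoding, and the use of a sequential (non-shuffled) split are assumptions, since the paper states only the ratios and the corpus size.

```python
# Hypothetical sketch of the 80/10/10 character-level train/val/test split
# described for War and Peace. File name and sequential splitting are assumptions.
def split_corpus(path="war_and_peace.txt", ratios=(0.8, 0.1, 0.1)):
    with open(path, encoding="utf-8") as f:
        text = f.read()  # the paper reports 3,258,246 characters
    n_train = int(ratios[0] * len(text))
    n_val = int(ratios[1] * len(text))
    train = text[:n_train]
    val = text[n_train:n_train + n_val]
    test = text[n_train + n_val:]
    return train, val, test

# Usage (requires the corpus file to exist locally):
# train, val, test = split_corpus()
```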
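The dropout-rate selection described in the Experiment Setup row amounts to a small validation sweep. The sketch below is an assumption about how that sweep might be organized; the paper releases no code, so the training routine, metric, and function names here are placeholders rather than the authors' implementation.

```python
# Hypothetical sketch of selecting the dropout rate by validation performance.
def select_dropout(candidate_rates, train_and_evaluate):
    """Return the dropout rate with the lowest validation loss/perplexity."""
    best_rate, best_val = None, float("inf")
    for rate in candidate_rates:
        val_metric = train_and_evaluate(dropout=rate)  # lower is better
        if val_metric < best_val:
            best_rate, best_val = rate, val_metric
    return best_rate

# Per the paper: the "256" models chose dropout from {0.1, 0.2}; the medium
# models chose from {0.4, 0.5, 0.6}. `train_and_evaluate` is a placeholder
# for a full training run of the model.
# rate_256 = select_dropout([0.1, 0.2], train_and_evaluate)
# rate_medium = select_dropout([0.4, 0.5, 0.6], train_and_evaluate)
```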