Full-Capacity Unitary Recurrent Neural Networks

Authors: Scott Wisdom, Thomas Powers, John Hershey, Jonathan Le Roux, Les Atlas

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We confirm the utility of our claims by empirically evaluating our new full-capacity uRNNs on both synthetic and natural data, achieving superior performance compared to both LSTMs and the original restricted-capacity uRNNs.
Researcher Affiliation | Collaboration | 1 Department of Electrical Engineering, University of Washington {swisdom, tcpowers, atlas}@uw.edu; 2 Mitsubishi Electric Research Laboratories (MERL) {hershey, leroux}@merl.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | All code to replicate our results is available from https://github.com/stwisdom/urnn.
Open Datasets | Yes | We use the TIMIT dataset [17]. ... For the task of system identification, we consider the problem of learning the dynamics of a nonlinear dynamical system that has the form (1), given a dataset of inputs and outputs of the system. ... pixel-by-pixel MNIST and permuted pixel-by-pixel MNIST
Dataset Splits | Yes | For all experiments, the number of training, validation, and test sequences are 20000, 1000, and 1000, respectively. ... According to common practice [18], we use a training set with 3690 utterances from 462 speakers, a validation set of 400 utterances, and an evaluation set of 192 utterances. ... We use 5000 of the 60000 training examples as a validation set to perform early stopping with a patience of 5.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type, memory) used for the experiments. It only mentions "All models are implemented in Theano".
Software Dependencies | No | The paper mentions "All models are implemented in Theano [16]" but does not provide a specific version number for Theano or any other software libraries, which is necessary for a reproducible setup.
Experiment Setup | Yes | The learning rate is 0.001 with a batch size of 50 for all experiments. ... The full-capacity uRNN uses a hidden state size of N = 128 with no gradient normalization. To match the number of parameters (≈22k), we use N = 470 for the restricted-capacity uRNN, and N = 68 for the LSTM. ... For the LSTM and restricted-capacity uRNNs, we use RMSprop [15] with a learning rate of 0.001, momentum 0.9, and averaging parameter 0.1. For the full-capacity uRNN, we also use RMSprop to optimize all network parameters, except for the recurrence matrix, for which we use stochastic gradient descent along the Stiefel manifold using the update (6) with a fixed learning rate of 0.001 and no gradient normalization. ... We use 5000 of the 60000 training examples as a validation set to perform early stopping with a patience of 5. The loss function is cross-entropy.
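The distinctive piece of the experiment setup above is the Stiefel-manifold step applied to the recurrence matrix of the full-capacity uRNN (the paper's update (6)): a Cayley-style retraction that keeps the matrix exactly unitary after each gradient step. The NumPy sketch below illustrates that kind of update under stated assumptions; the function name, the exact construction of the skew-Hermitian term A, and the toy usage are illustrative choices, not the authors' Theano code (which is available from the repository linked above).

```python
# Minimal sketch of a Cayley-style Stiefel-manifold step for a unitary
# recurrence matrix, in the spirit of the paper's update (6).
# Assumption: A = G W^H - W G^H, where G is the Euclidean gradient of the
# loss with respect to W; this is not the authors' released implementation.
import numpy as np

def stiefel_update(W, G, lr=0.001):
    """One manifold step for the recurrence matrix.

    W  : (N, N) complex unitary matrix (current recurrence matrix)
    G  : (N, N) complex Euclidean gradient of the loss w.r.t. W
    lr : fixed learning rate (0.001 in the paper, no gradient normalization)
    """
    N = W.shape[0]
    # Skew-Hermitian term built from the gradient and the current W.
    A = G @ W.conj().T - W @ G.conj().T
    I = np.eye(N, dtype=W.dtype)
    # Cayley-style retraction: (I + (lr/2) A)^{-1} (I - (lr/2) A) W.
    return np.linalg.solve(I + (lr / 2.0) * A, (I - (lr / 2.0) * A) @ W)

# Toy usage: a random unitary W stays (numerically) unitary after the step.
rng = np.random.default_rng(0)
N = 8
Q, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
W_next = stiefel_update(Q, G)
print(np.allclose(W_next.conj().T @ W_next, np.eye(N), atol=1e-10))  # True
```

Because A is skew-Hermitian, the Cayley factor is itself unitary, so the updated matrix remains on the unitary group without any re-projection step; this is what lets the full-capacity uRNN use the entire space of unitary recurrence matrices rather than a restricted parameterization.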