Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Full-Capacity Unitary Recurrent Neural Networks
Authors: Scott Wisdom, Thomas Powers, John Hershey, Jonathan Le Roux, Les Atlas
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We confirm the utility of our claims by empirically evaluating our new full-capacity uRNNs on both synthetic and natural data, achieving superior performance compared to both LSTMs and the original restricted-capacity uRNNs. |
| Researcher Affiliation | Collaboration | (1) Department of Electrical Engineering, University of Washington; (2) Mitsubishi Electric Research Laboratories (MERL) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code to replicate our results is available from https://github.com/stwisdom/urnn. |
| Open Datasets | Yes | We use the TIMIT dataset [17]. ... For the task of system identification, we consider the problem of learning the dynamics of a nonlinear dynamical system that has the form (1), given a dataset of inputs and outputs of the system. ... pixel-by-pixel MNIST and permuted pixel-by-pixel MNIST |
| Dataset Splits | Yes | For all experiments, the number of training, validation, and test sequences are 20000, 1000, and 1000, respectively. ... According to common practice [18], we use a training set with 3690 utterances from 462 speakers, a validation set of 400 utterances, an evaluation set of 192 utterances. ... We use 5000 of the 60000 training examples as a validation set to perform early stopping with a patience of 5. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type, memory) used for the experiments. It only mentions 'All models are implemented in Theano'. |
| Software Dependencies | No | The paper mentions 'All models are implemented in Theano [16]' but does not provide a specific version number for Theano or any other software libraries, which is necessary for reproducible setup. |
| Experiment Setup | Yes | The learning rate is 0.001 with a batch size of 50 for all experiments. ... The full-capacity uRNN uses a hidden state size of N = 128 with no gradient normalization. To match the number of parameters (≈22k), we use N = 470 for the restricted-capacity uRNN, and N = 68 for the LSTM. ... For the LSTM and restricted-capacity uRNNs, we use RMSprop [15] with a learning rate of 0.001, momentum 0.9, and averaging parameter 0.1. For the full-capacity uRNN, we also use RMSprop to optimize all network parameters, except for the recurrence matrix, for which we use stochastic gradient descent along the Stiefel manifold using the update (6) with a fixed learning rate of 0.001 and no gradient normalization. ... We use 5000 of the 60000 training examples as a validation set to perform early stopping with a patience of 5. The loss function is cross-entropy. |
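
The Experiment Setup row mentions stochastic gradient descent along the Stiefel manifold for the recurrence matrix, but the table does not reproduce update (6) itself. The following is a minimal NumPy sketch of one common way to take such a step, a Cayley-transform update that keeps a square unitary matrix unitary. The exact form of the skew-Hermitian matrix `A`, the function names, the toy loss, and the target matrix `T` are assumptions made for illustration and are not taken from the paper or from the authors' Theano code.

```python
import numpy as np


def cayley_stiefel_step(W, G, lr=0.001):
    """One descent step that keeps a square unitary matrix W unitary.

    W  : (N, N) complex unitary matrix (e.g. a recurrence matrix)
    G  : (N, N) complex gradient of the loss with respect to W
    lr : fixed learning rate (0.001 is the value quoted in the table)
    """
    n = W.shape[0]
    # Assumed form: a skew-Hermitian matrix built from the gradient.
    A = G @ W.conj().T - W @ G.conj().T
    I = np.eye(n, dtype=W.dtype)
    # Cayley transform: solve (I + lr/2 * A) W_new = (I - lr/2 * A) W.
    return np.linalg.solve(I + (lr / 2) * A, (I - (lr / 2) * A) @ W)


def random_unitary(n, rng):
    """Random unitary matrix from the QR factorization of a complex Gaussian."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(Z)
    return Q


# Toy problem (hypothetical): pull W toward a unitary target T.
rng = np.random.default_rng(0)
W = random_unitary(8, rng)
T = random_unitary(8, rng)

loss_before = np.linalg.norm(W - T) ** 2
for _ in range(200):
    # W - T is proportional to the gradient of ||W - T||_F^2.
    W = cayley_stiefel_step(W, W - T, lr=0.01)
loss_after = np.linalg.norm(W - T) ** 2

# Distance of W^H W from the identity; stays near machine precision.
unitarity_error = np.linalg.norm(W.conj().T @ W - np.eye(8))
```

Because the Cayley transform of a skew-Hermitian matrix is itself unitary, each step multiplies `W` by a unitary factor, so no re-orthogonalization is needed in this sketch. The larger learning rate in the toy loop is only there to make the loss change visible in a few iterations.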