Unitary Evolution Recurrent Neural Networks
Authors: Martin Arjovsky, Amar Shah, Yoshua Bengio
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the potential of this architecture by achieving state of the art results in several hard tasks involving very long-term dependencies. and In this section we explore the performance of our uRNN in relation to (a) RNN with tanh activations, (b) IRNN (Le et al., 2015), that is an RNN with ReLU activations and with the recurrent weight matrix initialized to the identity, and (c) LSTM (Hochreiter & Schmidhuber, 1997) models. |
| Researcher Affiliation | Academia | Martin Arjovsky MARJOVSKY@DC.UBA.AR Amar Shah AS793@CAM.AC.UK Yoshua Bengio Universidad de Buenos Aires, University of Cambridge, Université de Montréal. Yoshua Bengio is a CIFAR Senior Fellow. |
| Pseudocode | No | The paper describes the architecture and mathematical operations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | This along with other implementation details are discussed in Section 4, and the code used for the experiments is available online. |
| Open Datasets | Yes | We chose a handful of tasks to evaluate the performance of the various models. The tasks were especially created to be pathologically hard, and have been used as benchmarks for testing the ability of a model to capture long-term memory (Hochreiter & Schmidhuber, 1997; Le et al., 2015; Graves et al., 2014; Martens & Sutskever, 2011) and Pixel-by-pixel MNIST from (LeCun et al., 1998). |
| Dataset Splits | No | The paper discusses training and testing performance but does not provide specific details on validation dataset splits, percentages, or cross-validation methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running its experiments. It mentions 'GPU memory' generally but not specific hardware used. |
| Software Dependencies | No | The paper mentions using 'Theano' but does not specify its version number or any other software dependencies with their respective versions, which is necessary for reproducible ancillary software details. |
| Experiment Setup | Yes | In each experiment we use a learning rate of 10^-3 and a decay rate of 0.9. For the LSTM and RNN models, we had to clip gradients at 1 to avoid exploding gradients. and We initialize V and U (the input and output matrices) as in (Glorot & Bengio, 2010), with weights sampled independently from uniforms, U[−√(6/(n_in + n_out)), +√(6/(n_in + n_out))]. and The biases, b and b_o, are initialized to 0. and The diagonal weights for D1, D2 and D3 are sampled from a uniform, U[−π, π]. |
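
The initialization recipe quoted in the Experiment Setup row is concrete enough to sketch in code. Below is a minimal NumPy illustration, not the authors' released Theano implementation: the layer sizes, variable names, and the complex-diagonal form D = diag(exp(iθ)) are assumptions for illustration, following only the quoted Glorot uniform bounds, zero biases, and U[−π, π] sampling for the diagonal weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def glorot_uniform(n_in, n_out):
    # Glorot & Bengio (2010): weights ~ U[-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))]
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

# Illustrative sizes (hypothetical; the paper varies these per task).
n_input, n_hidden, n_output = 1, 128, 10

V = glorot_uniform(n_input, 2 * n_hidden)    # input matrix (assuming real/imag parts stacked)
U = glorot_uniform(2 * n_hidden, n_output)   # output matrix
b = np.zeros(2 * n_hidden)                   # hidden bias, initialized to 0
b_o = np.zeros(n_output)                     # output bias, initialized to 0

# Angles for the diagonal factors D1, D2, D3 of the unitary recurrence,
# sampled from U[-pi, pi] as quoted above; diag(exp(i * theta)) is unitary.
theta = {k: rng.uniform(-np.pi, np.pi, size=n_hidden) for k in ("D1", "D2", "D3")}
D1 = np.diag(np.exp(1j * theta["D1"]))
```

Note that the quoted learning rate of 10^-3 together with a decay rate of 0.9 corresponds to RMSProp-style optimizer settings, and the gradient clipping at 1 applies to the LSTM and RNN baselines rather than the uRNN itself.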