The Statistical Recurrent Unit
Authors: Junier B. Oliva, Barnabás Póczos, Jeff Schneider
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the efficacy of SRUs as compared to LSTMs and GRUs in an unbiased manner by optimizing the respective architectures' hyperparameters for both synthetic and real-world tasks. |
| Researcher Affiliation | Academia | Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA. Correspondence to: Junier B. Oliva <joliva@cs.cmu.edu>. |
| Pseudocode | No | The paper provides update equations and a graphical representation (Figure 1) but does not include structured pseudocode or an algorithm block (a hedged sketch of these updates appears after this table). |
| Open Source Code | Yes | See https://github.com/junieroliva/recurrent for code. |
| Open Datasets | Yes | Next we explore the ability of recurrent units to use long-term dependencies in one's data with a synthetic task using a real dataset. It has been observed that LSTMs perform poorly in classifying a long pixel-by-pixel sequence of MNIST digits (Le et al., 2015). |
| Dataset Splits | Yes | We generate a total of 176 points per sequence for 3200 training sequences, 400 validation sequences, and 400 testing sequences. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running experiments. |
| Software Dependencies | No | All experiments were performed in TensorFlow (Abadi et al., 2016) and used the standard implementations of GRUCell and BasicLSTMCell for GRUs and LSTMs respectively. |
| Experiment Setup | Yes | In all experiments we used SGD for optimization using gradient clipping (Pascanu et al., 2013) with a norm of 1 on all algorithms. Unless otherwise specified, 100 trials were performed to search over the following hyperparameters on a validation set: (1) initial learning rate, the initial learning rate used for SGD, in the range [exp(-10), 1]; (2) lr decay, the multiplier applied to the learning rate every 1k iterations, in the range [0.8, 0.999]; (3) dropout keep rate, the percent of output units kept during dropout, in the range (0, 1]; (4) num units, the number of units in the recurrent unit, in {1, ..., 256}. A sketch of such a search appears below. |
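
As the Pseudocode row notes, the paper specifies the SRU through update equations rather than an algorithm block. Below is a minimal NumPy sketch of the multi-scale moving-average recurrence as we read it from the paper: summary statistics feed back through a ReLU, and new statistics are kept as exponential moving averages at several decay scales. The weight names (`W_r`, `W_phi`, `W_x`, `W_o` and biases) and the particular set of scales are our own labels for illustration, not the authors' reference implementation (which lives in the linked repository).

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sru_step(x_t, mu_prev, params, alphas):
    """One step of the SRU recurrence (hedged sketch).

    mu_prev stacks one d-dimensional statistics vector per decay scale,
    so len(mu_prev) == len(alphas) * d. Weight names are illustrative.
    """
    W_r, b_r, W_phi, W_x, b_phi, W_o, b_o = params
    # Feedback: summary statistics from the previous step re-enter the unit.
    r_t = relu(W_r @ mu_prev + b_r)
    # Candidate statistics computed from the current input and the feedback.
    phi_t = relu(W_phi @ r_t + W_x @ x_t + b_phi)
    # Multi-scale exponential moving averages of the candidate statistics.
    mu_t = np.concatenate([
        a * mu_a + (1.0 - a) * phi_t
        for a, mu_a in zip(alphas, np.split(mu_prev, len(alphas)))
    ])
    # Output projection of the updated statistics.
    o_t = relu(W_o @ mu_t + b_o)
    return mu_t, o_t
```

Keeping averages at several decay rates (slow scales near 1 alongside fast scales near 0) is what lets a single unit summarize both recent and long-past inputs.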
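The Experiment Setup row describes a 100-trial search over four hyperparameter ranges, selected on a validation set. The following is a minimal sketch of one way to run that search; the log-uniform sampling of the learning rate and the `train_and_validate` callback are our assumptions, since the paper only states the search ranges and the trial count.

```python
import math
import random

def sample_config(rng):
    """Draw one hyperparameter setting from the ranges quoted above.
    Log-uniform sampling of the learning rate is an assumption; the
    paper only states the interval [exp(-10), 1]."""
    return {
        "init_lr": math.exp(rng.uniform(-10.0, 0.0)),  # [exp(-10), 1]
        "lr_decay": rng.uniform(0.8, 0.999),           # applied every 1k iters
        "dropout_keep": rng.uniform(1e-3, 1.0),        # keep rate in (0, 1]
        "num_units": rng.randint(1, 256),              # {1, ..., 256}, inclusive
    }

def random_search(train_and_validate, n_trials=100, seed=0):
    """Return the best configuration found over n_trials random draws.
    `train_and_validate` is a hypothetical callback that trains a model
    with SGD and gradient clipping (norm 1) under the given config and
    returns a validation score where higher is better."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = train_and_validate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

The same search would be run separately per architecture (SRU, GRU, LSTM), which is what the Research Type row means by optimizing the respective architectures' hyperparameters in an unbiased manner.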