Dilated Recurrent Neural Networks

Authors: Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, Michael Witbrock, Mark A. Hasegawa-Johnson, Thomas S. Huang

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate the DILATEDRNN in multiple RNN settings on a variety of sequential learning tasks, including long-term memorization, pixel-by-pixel classification of handwritten digits (with permutation and noise), character-level language modeling, and speaker identification with raw audio waveforms.
Researcher Affiliation | Collaboration | IBM Thomas J. Watson Research Center, Yorktown, NY 10598, USA; University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Pseudocode | No | The paper describes the architecture and processes using mathematical equations and textual explanations, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code for our method is publicly available: https://github.com/code-terminator/DilatedRNN
Open Datasets | Yes | We empirically validate the DILATEDRNN in multiple RNN settings on a variety of sequential learning tasks, including long-term memorization, pixel-by-pixel classification of handwritten digits (with permutation and noise), character-level language modeling on the Penn Treebank [16], and speaker identification with raw audio waveforms on VCTK [26].
Dataset Splits | Yes | Training, validation and testing sets are the default ones in Tensorflow. Hyperparameters and results are reported in table 1. [...] Results are reported for trained models that achieve the best validation loss.
Hardware Specification | Yes | Notably, the model with dilation starting at 64 is able to train within 17 minutes by using a single Nvidia P-100 GPU while maintaining a 93.5% test accuracy.
Software Dependencies | No | Unless specified otherwise, all the models are implemented with Tensorflow [1]. No specific version number for TensorFlow or any other software dependency is provided.
Experiment Setup | Yes | Unless specified otherwise, all the models are implemented with Tensorflow [1]. We use the default nonlinearities and RMSProp optimizer [21] with learning rate 0.001 and decay rate of 0.9. All weight matrices are initialized by the standard normal distribution. The batch size is set to 128. Furthermore, in all the experiments, we apply the sequence classification setting [25], where the output layer only adds at the end of the sequence. Results are reported for trained models that achieve the best validation loss.
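
The Pseudocode row above notes that the paper specifies the model only through equations; its core operation is the dilated recurrent skip connection c_t^(l) = f(x_t^(l), c_{t-s^(l)}^(l)), where s^(l) is the dilation of layer l. The following is a minimal NumPy sketch of that recurrence with a vanilla RNN cell and exponentially increasing dilations; the function names, shapes, and choice of cell are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def dilated_rnn_layer(inputs, W_x, W_h, b, dilation):
    """Sketch of one layer with a dilated recurrent skip connection.

    inputs: array of shape (T, input_size), one row per time step.
    dilation: s^(l); the state at step t depends on the state at step t - s^(l).
    Returns the hidden states, shape (T, hidden_size).
    """
    T = inputs.shape[0]
    hidden_size = W_h.shape[0]
    states = np.zeros((T, hidden_size))
    for t in range(T):
        # c_{t - s^(l)}^(l): a zero state for the first `dilation` steps.
        prev = states[t - dilation] if t >= dilation else np.zeros(hidden_size)
        states[t] = np.tanh(inputs[t] @ W_x + prev @ W_h + b)
    return states

def dilated_rnn(inputs, params, dilations=(1, 2, 4, 8)):
    # Stack layers with exponentially increasing dilations, as in the paper's
    # multi-layer construction; `params` holds (W_x, W_h, b) per layer.
    x = inputs
    for (W_x, W_h, b), s in zip(params, dilations):
        x = dilated_rnn_layer(x, W_x, W_h, b, s)
    return x
```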
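
The Dataset Splits row quotes the paper as relying on TensorFlow's default MNIST splits, and the Research Type and Open Datasets rows mention pixel-by-pixel digit classification with permutation. A hedged data-preparation sketch is below; it uses tf.keras.datasets.mnist because the paper's TensorFlow version is unspecified, and the 55,000/5,000 train/validation carve-out is an assumption that mirrors the split older TensorFlow MNIST readers exposed.

```python
import numpy as np
import tensorflow as tf

# Load MNIST; tf.keras ships a fixed 60,000/10,000 train/test split.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Assumption: hold out the last 5,000 training images as a validation set,
# mirroring the 55,000/5,000/10,000 split of older TensorFlow MNIST readers.
x_val, y_val = x_train[55000:], y_train[55000:]
x_train, y_train = x_train[:55000], y_train[:55000]

def to_pixel_sequences(images, permutation=None):
    # Pixel-by-pixel setting: flatten each 28x28 image into a 784-step
    # sequence with one pixel per time step, scaled to [0, 1].
    seqs = images.reshape(len(images), 28 * 28, 1).astype("float32") / 255.0
    if permutation is not None:           # permuted-MNIST variant
        seqs = seqs[:, permutation, :]
    return seqs

rng = np.random.RandomState(0)            # fixed permutation; seed is an assumption
perm = rng.permutation(28 * 28)
x_train_seq = to_pixel_sequences(x_train, perm)
```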
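
The Experiment Setup row fixes the optimizer (RMSProp, learning rate 0.001, decay rate 0.9), standard-normal weight initialization, batch size 128, and the sequence classification setting in which the output layer is attached only to the final time step. The sketch below expresses those settings with the tf.keras API as a stand-in; the stock LSTM used in place of the DILATEDRNN cell, the 256-unit width, and mapping the paper's "decay rate" to RMSprop's rho argument are all assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 10          # e.g. digit classes for pixel-by-pixel MNIST
SEQ_LEN, INPUT_DIM = 784, 1

# All weight matrices initialized from the standard normal distribution.
init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=1.0)

# Placeholder recurrent stack: a stock LSTM stands in for the DILATEDRNN cell;
# return_sequences defaults to False, so only the last hidden state is kept,
# matching the sequence classification setting (output layer at the end only).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, INPUT_DIM)),
    tf.keras.layers.LSTM(256, kernel_initializer=init,
                         recurrent_initializer=init),
    tf.keras.layers.Dense(NUM_CLASSES, kernel_initializer=init),
])

# RMSProp with learning rate 0.001; the paper's "decay rate of 0.9" is mapped
# to Keras' rho argument, which is an assumption about the intended parameter.
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Batch size 128; model selection by best validation loss, as stated in the paper.
# model.fit(x_train_seq, y_train, batch_size=128, epochs=..., validation_data=...)
```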