Temporal Pyramid Recurrent Neural Network

Authors: Qianli Ma, Zhenxi Lin, Enhuan Chen, Garrison Cottrell

AAAI 2020, pp. 5061-5068

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TP-RNN on several sequence modeling tasks, including the masked addition problem, pixel-by-pixel image classification, signal recognition, and speaker identification. Experimental results demonstrate that TP-RNN consistently outperforms existing RNNs for learning long-term and multi-scale dependencies in sequential data.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, South China University of Technology, Guangzhou, China; (2) Department of Computer Science and Engineering, University of California, San Diego, CA, USA
Pseudocode | No | The paper describes the model with mathematical equations and figures, but does not provide a formal pseudocode or algorithm block.
Open Source Code | Yes | The supplementary material is publicly available at https://github.com/qianlima-lab/TPRNN
Open Datasets | Yes | We evaluate TP-RNN on two pixel-by-pixel image classification data sets, MNIST and pMNIST (Le, Jaitly, and Hinton 2015). ... We also evaluate TP-RNN on a speaker identification task using the English multi-speaker corpus from the CSTR voice cloning toolkit (VCTK) (Yamagishi 2012). (See the pixel-sequence loading sketch after the table.)
Dataset Splits | Yes | It contains 9000 sequences of length 1000, which are divided into a training set, a validation set and a test set according to the ratio of 7:1:2. (The resulting split sizes are worked out in the sketch after the table.)
Hardware Specification | No | The paper does not specify any hardware details such as GPU or CPU models used for running the experiments.
Software Dependencies | No | The paper mentions using LSTM as the basic RNN unit and the Adam optimizer, but does not provide specific version numbers for any software dependencies or frameworks.
Experiment Setup | Yes | The number of layers for LSTM-M, HM-LSTM and TP-RNN-M are all set to 3, and the hidden sizes for LSTMs, HM-LSTM and TP-RNNs are all set to 100. For Dilated LSTMs, the number of layers is set to 9 as in (Chang et al. 2017), while the hidden size is set to 59 for a comparable number of parameters with LSTM-M, HM-LSTM and TP-RNN-M (this setting yields better results than those in (Chang et al. 2017)). All models are trained with the Adam optimizer, and the learning rate and decay rate are set to 1e-3 and 0.9, respectively. (A hedged configuration sketch follows the table.)
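
The "Open Datasets" row refers to pixel-by-pixel MNIST and pMNIST. As an illustration of that input format, below is a minimal sketch in which each 28x28 image is flattened into a length-784 sequence of single pixels, and pMNIST applies one fixed random permutation to every sequence (the standard setup of Le, Jaitly, and Hinton 2015). The permutation seed and any preprocessing details are assumptions, not taken from the paper or its repository.

```python
import torch
from torchvision import datasets, transforms

# One fixed permutation shared by all pMNIST sequences.
# The seed is an assumption; the paper does not specify it.
PERM = torch.randperm(784, generator=torch.Generator().manual_seed(0))

def to_pixel_sequence(img, permute=False):
    """Flatten a 1x28x28 image tensor into a (784, 1) pixel sequence."""
    seq = img.view(-1, 1)                   # 784 time steps, 1 feature each
    return seq[PERM] if permute else seq

train_set = datasets.MNIST(root="./data", train=True, download=True,
                           transform=transforms.ToTensor())
img, label = train_set[0]
seq = to_pixel_sequence(img, permute=True)  # pMNIST-style input
print(seq.shape)                            # torch.Size([784, 1])
```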
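
The "Dataset Splits" row quotes a 7:1:2 split of 9000 sequences, i.e. 6300 training, 900 validation, and 1800 test examples. A minimal sketch of such a partition (random shuffling and the seed are assumptions; the paper does not say how the split was drawn):

```python
import numpy as np

def split_indices(n=9000, ratios=(0.7, 0.1, 0.2), seed=0):
    """Partition n example indices into train/val/test by the given ratios."""
    rng = np.random.default_rng(seed)       # seed is an assumption
    idx = rng.permutation(n)
    n_train = int(n * ratios[0])            # 6300
    n_val = int(n * ratios[1])              # 900
    return np.split(idx, [n_train, n_train + n_val])

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 6300 900 1800
```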
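
The "Experiment Setup" row fixes the baseline architecture and optimizer hyperparameters. The sketch below shows a 3-layer stacked LSTM with hidden size 100 trained with Adam at learning rate 1e-3, reading the "decay rate" of 0.9 as an exponential learning-rate decay; that reading, and the classifier head, are assumptions. TP-RNN's pyramid structure itself is not reproduced here; the authors' repository (https://github.com/qianlima-lab/TPRNN) contains the actual model.

```python
import torch
import torch.nn as nn

class StackedLSTMClassifier(nn.Module):
    """3-layer LSTM baseline (LSTM-M-style) with hidden size 100."""
    def __init__(self, input_size, num_classes, hidden_size=100, num_layers=3):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                   # x: (batch, time, input_size)
        outputs, _ = self.rnn(x)
        return self.fc(outputs[:, -1])      # classify from the last time step

model = StackedLSTMClassifier(input_size=1, num_classes=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# "Decay rate 0.9" is read here as a per-epoch exponential LR decay (assumption).
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
```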