Learning to Adaptively Scale Recurrent Neural Networks
Authors: Hao Hu, Liqiang Wang, Guo-Jun Qi
AAAI 2019, pp. 3822-3829
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on multiple sequence modeling tasks indicate ASRNNs can efficiently adapt scales based on different sequence contexts and yield better performances than baselines without dynamical scaling abilities. To verify the effectiveness of ASRNNs, we conduct extensive experiments on various sequence modeling tasks, including low density signal identification, long-term memorization, pixel-to-pixel image classification, music genre recognition and language modeling. |
| Researcher Affiliation | Collaboration | Hao Hu¹, Liqiang Wang¹, Guo-Jun Qi²; ¹University of Central Florida, ²Huawei Cloud |
| Pseudocode | No | The paper describes the model and its components using mathematical equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing their code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Low Density Signal Type Identification: 'we randomly generate 2000 low density sequences for each type. We choose 1600 sequences per type for training and the remaining are for testing.' Pixel-to-Pixel Image Classification: 'MNIST benchmark (LeCun et al. 1998).' Music Genre Recognition: 'free music archive (FMA) dataset (Defferrard et al. 2017) to conduct our experiments.' Word Level Language Modeling: 'WikiText-2 (Merity et al. 2016) dataset'. |
| Dataset Splits | Yes | Music Genre Recognition: 'We follow the standard 80/10/10% data splitting protocols to get training, validation and test sets.' For other tasks such as MNIST, standard data split settings are cited, implying the commonly used splits, including validation (a split sketch follows the table). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper states: 'Unless specified otherwise, all the models are implemented using Tensorflow library (Abadi et al. 2016).' However, it does not specify a version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | We train all the models with the RMSProp optimizer (Tieleman and Hinton 2012) and set the learning rate and decay rate to 0.001 and 0.9, respectively. All the weight matrices are initialized with Glorot uniform initialization (Glorot and Bengio 2010). For ASRNNs, we choose the Haar wavelet as the default wavelet kernel and set the τ of the Gumbel-Softmax to 0.1. For both SRNNs and ASRNNs, the maximal considered scale J and wavelet kernel size K are set to 4 and 8, respectively. (A configuration sketch follows the table.) |
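
Below is a minimal sketch of the 80/10/10% split protocol reported for the music genre recognition task. The `num_examples` value and the index-based shuffling are illustrative assumptions, not the authors' FMA preprocessing pipeline.

```python
import numpy as np

# Hypothetical example count standing in for the FMA subset used in the paper.
num_examples = 10000
rng = np.random.default_rng(seed=0)
indices = rng.permutation(num_examples)

# 80/10/10% split into training, validation and test sets, as reported.
n_train = int(0.8 * num_examples)
n_val = int(0.1 * num_examples)
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]
```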
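
And a minimal sketch of the reported training configuration, assuming a TensorFlow 2.x-style Keras API (the paper only states that TensorFlow was used, without a version). The `gumbel_softmax` helper and all names below are illustrative, not the authors' implementation.

```python
import tensorflow as tf

# RMSProp with learning rate 0.001 and decay rate 0.9, as stated in the paper.
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)

# Glorot uniform initialization for all weight matrices.
weight_init = tf.keras.initializers.GlorotUniform()

# Gumbel-Softmax temperature tau = 0.1, used by ASRNNs for scale selection.
TAU = 0.1

def gumbel_softmax(logits, tau=TAU):
    """Draw a differentiable, approximately one-hot sample from `logits`."""
    uniform = tf.random.uniform(tf.shape(logits), minval=1e-9, maxval=1.0)
    gumbel_noise = -tf.math.log(-tf.math.log(uniform))
    return tf.nn.softmax((logits + gumbel_noise) / tau, axis=-1)

# Maximal considered scale J = 4 and wavelet kernel size K = 8 (Haar kernels).
J, K = 4, 8
```

At τ = 0.1 the softmax output is close to one-hot, which matches the near-discrete scale selection role the paper assigns to the Gumbel-Softmax.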