Selfish Sparse RNN Training
Authors: Shiwei Liu, Decebal Constantin Mocanu, Yulong Pei, Mykola Pechenizkiy
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using these strategies, we achieve state-of-the-art sparse training results, better than the dense-to-sparse methods, with various types of RNNs on Penn Tree Bank and Wikitext-2 datasets. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Eindhoven University of Technology, the Netherlands; ²Faculty of Electrical Engineering, Mathematics, and Computer Science at University of Twente, the Netherlands |
| Pseudocode | Yes | The pseudocode of the full training procedure of our algorithm is shown in Algorithm 1. |
| Open Source Code | Yes | Our codes are available at https://github.com/Shiweiliuiiiiiii/Selfish-RNN. |
| Open Datasets | Yes | Penn Tree Bank dataset (Marcus et al., 1993) and AWD-LSTM-MoS on WikiText-2 dataset (Melis et al., 2018). |
| Dataset Splits | Yes | Single model perplexity on validation and test sets for the Penn Treebank language modeling task with stacked LSTMs and RHNs. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers like 'Adam (Kingma & Ba, 2014)' but does not provide specific software dependency names with version numbers for libraries, frameworks, or environments used. |
| Experiment Setup | Yes | For fair comparison, we use the exact same hyperparameters and regularization introduced in ON-LSTM (Shen et al., 2019) and AWD-LSTM-MoS (Yang et al., 2018). See Appendix A for hyperparameters. |
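To make the "Pseudocode" row more concrete, below is a minimal, hedged sketch of a generic drop-and-grow sparse-training step on a single RNN weight matrix: prune the smallest-magnitude active weights, then regrow the same number of connections at random inactive positions so overall sparsity stays fixed. This is an illustrative sketch of sparse training in general, not a reproduction of the paper's Algorithm 1 (which additionally uses cell-weight redistribution and SNT-ASGD); the function name `prune_and_regrow` and the drop fraction are assumptions for illustration.

```python
import torch

def prune_and_regrow(weight, mask, drop_fraction=0.3):
    """Hypothetical sketch of one drop-and-grow sparse-training update.

    Drops the smallest-magnitude active weights and regrows the same
    number of connections at random inactive positions, keeping the
    sparsity level constant. Not the paper's Algorithm 1.
    """
    active = mask.bool()
    n_active = int(active.sum())
    n_drop = int(drop_fraction * n_active)
    if n_drop == 0:
        return mask

    # Drop: zero out the n_drop active weights with the smallest magnitude.
    magnitudes = torch.where(active, weight.abs(),
                             torch.full_like(weight, float("inf")))
    drop_idx = torch.topk(magnitudes.flatten(), n_drop, largest=False).indices
    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0

    # Grow: reactivate n_drop connections chosen uniformly at random
    # among the currently inactive positions.
    inactive_idx = (new_mask == 0).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(len(inactive_idx))[:n_drop]]
    new_mask[grow_idx] = 1.0
    new_mask = new_mask.view_as(mask)

    # Newly grown weights start at zero; dropped weights are zeroed out.
    weight.data.mul_(new_mask)
    return new_mask


if __name__ == "__main__":
    # Usage: a 50%-sparse hidden-to-hidden matrix of an LSTM-like cell.
    torch.manual_seed(0)
    w = torch.nn.Parameter(torch.randn(256, 256))
    mask = (torch.rand_like(w) < 0.5).float()
    w.data.mul_(mask)
    mask = prune_and_regrow(w, mask, drop_fraction=0.3)
    print("sparsity:", 1.0 - mask.mean().item())
```

In a full training loop, a step like this would typically be applied to each sparse RNN layer at a fixed interval between optimizer updates, with the mask multiplied into the weights after every gradient step so pruned connections stay inactive.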