One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
Authors: Shunshi Zhang, Bradly C. Stadie
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method is evaluated with a GRU network on sequential MNIST, Wikitext, and Billion Words. |
| Researcher Affiliation | Academia | Matthew Shunshi Zhang, University of Toronto (matthew.zhang@mail.utoronto.ca); Bradly C. Stadie, Vector Institute |
| Pseudocode | Yes | Algorithm 1: Pruning Recurrent Networks (a hedged sketch of the one-shot prune-then-train workflow follows the table) |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We evaluate on sequential MNIST, Billion Words, and Wikitext. (Lee et al., 2018) (Chelba et al., 2013) |
| Dataset Splits | Yes | We report the training and validation perplexities on a random 1% sample of the training set in Table 4. … Table 1: Validation Error % of Various 400 Unit RNN Architectures after 50 Epochs of Training on Seq. MNIST |
| Hardware Specification | Yes | We trained all networks with a single Nvidia P100 GPU. |
| Software Dependencies | No | The paper mentions the use of the Adam optimizer and Glorot initialization, but does not specify version numbers for any programming languages or software libraries used in the experiments. |
| Experiment Setup | Yes | We use a minibatch size of 64 samples during training, and optimize using the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 1e-3. We use an initial hidden state of zeros for all experiments. (A configuration sketch follows the table.) |
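The Pseudocode row points to Algorithm 1 (Pruning Recurrent Networks), which is not reproduced here. The paper's criterion ranks weights via the spectrum of the recurrent Jacobian; the minimal PyTorch sketch below illustrates only the general one-shot prune-then-train workflow, substituting a SNIP-style gradient-times-weight saliency (Lee et al., 2018, cited in the table) for the authors' Jacobian-spectrum score. The function name `one_shot_prune` and its interface are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

def one_shot_prune(model: nn.Module, loss: torch.Tensor, sparsity: float):
    """Zero out the lowest-saliency fraction `sparsity` of the model's weights.

    Saliency here is SNIP-style |grad * weight| from a single minibatch loss,
    standing in for the paper's Jacobian-spectrum criterion.
    """
    params = [p for name, p in model.named_parameters() if "weight" in name]
    grads = torch.autograd.grad(loss, params)
    saliency = torch.cat([(g * p).abs().flatten() for g, p in zip(grads, params)])
    k = max(int(sparsity * saliency.numel()), 1)
    threshold = saliency.kthvalue(k).values  # k-th smallest saliency
    masks = []
    with torch.no_grad():
        for g, p in zip(grads, params):
            mask = ((g * p).abs() > threshold).float()
            p.mul_(mask)  # one-shot: the mask is applied once, before training
            masks.append(mask)
    return masks  # reapply after each optimizer step to keep pruned weights at zero
```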
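The Experiment Setup row reports a minibatch size of 64, Adam with learning rate 1e-3, and a zero initial hidden state; Table 1's quoted caption indicates 400-unit networks, and the Software Dependencies row mentions Glorot initialization. The sketch below assembles these reported settings for a GRU on sequential MNIST. The input size of 1 (pixel-by-pixel sequences of length 784) and the linear classification head are assumptions not stated in the quoted text.

```python
import torch
import torch.nn as nn

# 400-unit GRU matching Table 1; input_size=1 assumes pixel-by-pixel
# sequential MNIST (784 steps), which the quoted text does not state.
model = nn.GRU(input_size=1, hidden_size=400, batch_first=True)
head = nn.Linear(400, 10)             # assumed 10-way digit classification head
nn.init.xavier_uniform_(head.weight)  # Glorot initialization, as the paper mentions

optimizer = torch.optim.Adam(
    list(model.parameters()) + list(head.parameters()), lr=1e-3
)

x = torch.randn(64, 784, 1)   # dummy minibatch of 64 pixel sequences
y = torch.randint(0, 10, (64,))
h0 = torch.zeros(1, 64, 400)  # zero initial hidden state, as reported

out, _ = model(x, h0)
logits = head(out[:, -1])     # classify from the final hidden state
loss = nn.functional.cross_entropy(logits, y)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```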