Exploring Sparsity in Recurrent Neural Networks

Authors: Sharan Narang, Greg Diamos, Shubho Sengupta, Erich Elsen

ICLR 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We run all our experiments on a training set of 2100 hours of English speech data and a validation set of 3.5 hours of multi-speaker data. |
| Researcher Affiliation | Industry | Sharan Narang, Greg Diamos, Shubho Sengupta & Erich Elsen, Baidu Research, {sharan,gdiamos,ssengupta}@baidu.com; now at Google Brain, eriche@google.com |
| Pseudocode | Yes | Algorithm 1 Pruning Algorithm (a hedged sketch of this pruning schedule follows the table) |
| Open Source Code | No | The paper does not provide an unambiguous statement of, or link to, open-source code for its methodology. |
| Open Datasets | No | The paper states, "We run all our experiments on a training set of 2100 hours of English speech data and a validation set of 3.5 hours of multi-speaker data. This is a small subset of the datasets that we use to train our state-of-the-art automatic speech recognition models.", but does not provide any information about public availability or access. |
| Dataset Splits | Yes | We run all our experiments on a training set of 2100 hours of English speech data and a validation set of 3.5 hours of multi-speaker data. |
| Hardware Specification | Yes | The performance benchmark was run using NVIDIA's cuDNN and cuSPARSE libraries on a Titan X Maxwell GPU and compiled using CUDA 7.5. |
| Software Dependencies | Yes | The performance benchmark was run using NVIDIA's cuDNN and cuSPARSE libraries on a Titan X Maxwell GPU and compiled using CUDA 7.5. |
| Experiment Setup | Yes | We train the models using Nesterov SGD for 20 epochs. Besides the hyper-parameters for determining the threshold, all other hyper-parameters remain unchanged between the dense and sparse training runs. In the sparse run, the pruning begins shortly after the first epoch and continues until the 10th epoch. |
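For context on the "Pseudocode" and "Experiment Setup" rows above: the paper's Algorithm 1 prunes weights during training by zeroing those whose magnitude falls below a threshold that grows over a window of the training run (pruning starts shortly after the first epoch and continues until the 10th epoch). The sketch below illustrates that general idea only; the linear threshold ramp, the iteration counts, and every parameter name are illustrative assumptions, not the authors' exact schedule or hyper-parameters.

```python
# Minimal sketch of magnitude-based gradual pruning during training.
# Assumed schedule: the threshold ramps linearly from 0 to a final value
# between a start and end iteration, then holds constant. The paper's
# actual Algorithm 1 derives its threshold from its own hyper-parameters.
import numpy as np


def pruning_threshold(itr, start_itr, end_itr, final_threshold):
    """Return the pruning threshold at iteration `itr` (assumed linear ramp)."""
    if itr < start_itr:
        return 0.0
    if itr >= end_itr:
        return final_threshold
    return final_threshold * (itr - start_itr) / (end_itr - start_itr)


def apply_pruning(weights, mask, threshold):
    """Zero out weights whose magnitude is below the current threshold.
    The mask keeps already-pruned weights at zero for the rest of training."""
    mask &= np.abs(weights) >= threshold
    weights *= mask  # the boolean mask is cast to the weight dtype
    return weights, mask


# Toy usage with a single recurrent weight matrix and a stand-in training loop.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(256, 256)).astype(np.float32)
mask = np.ones_like(w, dtype=bool)

for itr in range(1000):
    # ... the normal Nesterov SGD update on w would happen here ...
    thr = pruning_threshold(itr, start_itr=100, end_itr=800, final_threshold=0.15)
    w, mask = apply_pruning(w, mask, thr)

print(f"Final sparsity: {1.0 - mask.mean():.1%}")
```

This framing is consistent with the quoted setup, where only the threshold-related hyper-parameters differ between the dense and sparse runs, so pruning acts as a small add-on to an otherwise unchanged training loop.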