MomentumRNN: Integrating Momentum into Recurrent Neural Networks
Authors: Tan Nguyen, Richard Baraniuk, Andrea Bertozzi, Stanley Osher, Bao Wang
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the effectiveness of our momentum approach in designing RNNs in terms of convergence speed and accuracy. We compare the performance of the Momentum LSTM with the baseline LSTM [24] in the following tasks: 1) the object classification task on pixel-permuted MNIST [32], 2) the speech prediction task on the TIMIT dataset [1, 22, 62, 38, 23], 3) the celebrated copying and adding tasks [24, 1], and 4) the language modeling task on the Penn Tree Bank (PTB) dataset [39]. |
| Researcher Affiliation | Academia | Tan M. Nguyen, Department of ECE, Rice University, Houston, USA; Richard G. Baraniuk, Department of ECE, Rice University, Houston, USA; Andrea L. Bertozzi, Department of Mathematics, University of California, Los Angeles; Stanley J. Osher, Department of Mathematics, University of California, Los Angeles; Bao Wang, Department of Mathematics and Scientific Computing and Imaging (SCI) Institute, University of Utah, Salt Lake City, UT, USA |
| Pseudocode | No | The paper includes architectural illustrations and mathematical equations, but no structured pseudocode or algorithm blocks; a sketch of the recurrence reconstructed from those equations appears after the table. |
| Open Source Code | No | The paper mentions building its experiments on the baseline codebases provided by [5] and [54], but it does not release or link to an implementation of its own method. |
| Open Datasets | Yes | We compare the performance of the Momentum LSTM with the baseline LSTM [24] in the following tasks: 1) the object classification task on pixel-permuted MNIST [32], 2) the speech prediction task on the TIMIT dataset [1, 22, 62, 38, 23], 3) the celebrated copying and adding tasks [24, 1], and 4) the language modeling task on the Penn Tree Bank (PTB) dataset [39]. |
| Dataset Splits | Yes | We use the standard train/validation/test separation in [62, 34, 6], thereby having 3640 utterances for the training set with a validation set of size 192 and a test set of size 400. Results are reported on the test set using the model parameters that yield the best validation loss. |
| Hardware Specification | No | The paper does not state the hardware used for its experiments (e.g., GPU/CPU models, clock speeds, or memory amounts). |
| Software Dependencies | No | The paper mentions PyTorch [48] and refers to external codebases for the baselines, but it does not list library or solver names with version numbers needed to replicate the experiments. |
| Experiment Setup | Yes | We include details on the models, datasets, training procedure, and hyperparameters used in our experiments in Appendix A. |
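
The Pseudocode row notes that the method is specified only through figures and equations. As a point of reference, below is a minimal PyTorch sketch of the MomentumRNN recurrence reconstructed from the paper's update equations, $v_t = \mu v_{t-1} + s\,U x_t$ and $h_t = \sigma(W h_{t-1} + v_t + b)$. The class name, the default values of $\mu$ and $s$, and the unrolling loop are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class MomentumRNNCell(nn.Module):
    """Illustrative MomentumRNN cell (a sketch, not the authors' implementation).

    Recurrence from the paper:
        v_t = mu * v_{t-1} + s * (U x_t)
        h_t = sigma(W h_{t-1} + v_t + b)
    where mu is the momentum coefficient and s the step size.
    """

    def __init__(self, input_size, hidden_size, mu=0.6, s=0.6):
        super().__init__()
        self.U = nn.Linear(input_size, hidden_size, bias=False)  # input-to-hidden map U
        self.W = nn.Linear(hidden_size, hidden_size, bias=True)  # hidden-to-hidden map W (carries b)
        self.mu = mu  # momentum coefficient (hyperparameter)
        self.s = s    # step size (hyperparameter)

    def forward(self, x, state=None):
        if state is None:
            zeros = x.new_zeros(x.size(0), self.W.out_features)
            state = (zeros, zeros)
        h_prev, v_prev = state
        v = self.mu * v_prev + self.s * self.U(x)  # momentum accumulation of the input drive
        h = torch.tanh(self.W(h_prev) + v)         # standard recurrent nonlinearity on top
        return h, (h, v)

# Unrolling over a sequence of shape (batch, time, input_size):
cell = MomentumRNNCell(input_size=1, hidden_size=128)
x = torch.randn(32, 100, 1)
state = None
for t in range(x.size(1)):
    h, state = cell(x[:, t, :], state)
```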
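Similarly, the copying and adding tasks quoted in the Research Type and Open Datasets rows are synthetic benchmarks whose construction the table does not spell out. Below is a minimal sketch of the standard adding-problem generator in the sense of [24, 1]; the half-sequence marker placement and the default batch and sequence sizes are common conventions assumed here, not details confirmed by the paper.

```python
import torch

def adding_problem_batch(batch_size=128, seq_len=750):
    """Standard adding-problem data (illustrative, not the authors' generator).

    Each sequence has two channels: uniform random values in [0, 1] and a
    0/1 marker channel with exactly two 1s. The regression target is the
    sum of the two marked values.
    """
    values = torch.rand(batch_size, seq_len)
    markers = torch.zeros(batch_size, seq_len)
    # Place one marker in each half of the sequence (a common convention).
    first = torch.randint(0, seq_len // 2, (batch_size,))
    second = torch.randint(seq_len // 2, seq_len, (batch_size,))
    markers[torch.arange(batch_size), first] = 1.0
    markers[torch.arange(batch_size), second] = 1.0
    x = torch.stack([values, markers], dim=-1)       # inputs: (batch, time, 2)
    y = (values * markers).sum(dim=1, keepdim=True)  # targets: (batch, 1)
    return x, y
```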