Lipschitz Recurrent Neural Networks
Authors: N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling and speech prediction tasks. |
| Researcher Affiliation | Collaboration | N. Benjamin Erichson (ICSI and UC Berkeley, erichson@berkeley.edu); Omri Azencot (Ben-Gurion University, azencot@cs.bgu.ac.il); Alejandro Queiruga (Google Research, afq@google.com); Liam Hodgkinson (ICSI and UC Berkeley, liam.hodgkinson@berkeley.edu); Michael W. Mahoney (ICSI and UC Berkeley, mmahoney@stat.berkeley.edu) |
| Pseudocode | No | The paper describes the proposed model and methods using mathematical equations and textual descriptions, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Research code is shared via github.com/erichson/LipschitzRNN. |
| Open Datasets | Yes | The model is applied to ordered and permuted pixel-by-pixel MNIST classification, as well as to audio data using the TIMIT dataset. ... Next, we consider the TIMIT dataset (Garofolo, 1993)... ... Penn Tree Bank (PTB) (Marcus et al., 1993). |
| Dataset Splits | Yes | To compare our results with those of other models, we used the common train / validation / test split: 3690 utterances from 462 speakers for training, 192 utterances for validation, and 400 utterances for testing. ... The dataset is composed of a train / validation / test set, where 5017K characters are used for training, 393K characters are used for validation and 442K characters are used for testing. |
| Hardware Specification | No | The paper mentions support from 'Amazon AWS and Google Cloud' but does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for running experiments. |
| Software Dependencies | No | The paper mentions using 'PyHessian (Yao et al., 2019)' but does not provide specific version numbers for this or any other software libraries or dependencies used for the experiments. |
| Experiment Setup | Yes | For tuning we utilized a standard training procedure using a non-exhaustive random search within the following plausible ranges for our weight parameterization: β ∈ {0.65, 0.7, 0.75, 0.8}, γ ∈ [0.001, 1.0]. For Adam we explored learning rates between 0.001 and 0.005, and for SGD we considered 0.1. For the step size we explored values in the range 0.001 to 1.0. ... Table 8: Tuning parameters used for our experimental results and the performance evaluated with 12 different seed values for the parameter initialization of the model. |
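The β and γ values quoted in the Experiment Setup row parameterize the hidden-to-hidden matrix of the Lipschitz RNN. A minimal NumPy sketch of that construction follows, assuming the paper's form A<sub>β,γ</sub> = (1−β)(M+Mᵀ) + β(M−Mᵀ) − γI together with a forward-Euler update of the continuous-time dynamics; the function names are my own, and exact constants should be checked against the released code.

```python
import numpy as np

def build_A(M, beta=0.75, gamma=0.001):
    """Hidden-to-hidden matrix, assumed form from the paper:
    A = (1 - beta) * (M + M.T) + beta * (M - M.T) - gamma * I.
    Larger beta emphasizes the skew-symmetric (rotation-like) part;
    gamma > 0 shifts eigenvalue real parts toward the stable half-plane."""
    n = M.shape[0]
    sym = M + M.T    # symmetric part
    skew = M - M.T   # skew-symmetric part
    return (1 - beta) * sym + beta * skew - gamma * np.eye(n)

def euler_step(h, x, A, W, U, b, eps=0.1):
    """One forward-Euler step of the assumed Lipschitz RNN dynamics
    h' = A h + tanh(W h + U x + b), with step size eps."""
    return h + eps * (A @ h + np.tanh(W @ h + U @ x + b))

rng = np.random.default_rng(0)
n, d = 8, 4
M = rng.standard_normal((n, n)) / np.sqrt(n)
A = build_A(M, beta=0.75, gamma=0.001)

# With beta = 1.0 the symmetric part vanishes, so A is skew-symmetric
# minus gamma * I and every eigenvalue has real part exactly -gamma.
A_skew = build_A(M, beta=1.0, gamma=0.1)
print(np.allclose(np.linalg.eigvals(A_skew).real, -0.1))
```

The eigenvalue check illustrates why γ matters for stability: the skew-symmetric component contributes only imaginary spectrum, and the −γI shift keeps the real parts strictly negative.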