Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Long Expressive Memory for Sequence Modeling
Authors: T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results, ranging from image and time-series classification through dynamical systems prediction to keyword spotting and language modeling, demonstrate that LEM outperforms state-of-the-art recurrent neural networks, gated recurrent units, and long short-term memory models. We provide an extensive empirical evaluation of LEM on a wide variety of data sets, including image and sequence classification, dynamical systems prediction, keyword spotting, and language modeling, thereby demonstrating that LEM outperforms or is comparable to state-of-the-art RNNs, GRUs and LSTMs in each task (Section 5). |
| Researcher Affiliation | Academia | T. Konstantin Rusch ETH Zürich EMAIL Siddhartha Mishra ETH Zürich EMAIL N. Benjamin Erichson University of Pittsburgh EMAIL Michael W. Mahoney ICSI and UC Berkeley EMAIL |
| Pseudocode | No | The paper presents mathematical equations and formulas but no clearly labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | All code to reproduce our results can be found at https://github.com/tk-rusch/LEM. |
| Open Datasets | Yes | We consider three experiments based on two widely-used image recognition data sets, i.e., MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky et al., 2009)... The Google Speech Commands data set V2 (Warden, 2018)... Penn Treebank (PTB) corpus (Marcus et al., 1993), preprocessed by Mikolov et al. (2010). |
| Dataset Splits | Yes | Following Morrill et al. (2021) and Rusch & Mishra (2021b), we divide the data into a train, validation and test set according to a 70%, 15%, 15% ratio. |
| Hardware Specification | Yes | All experiments were run on CPU, namely Intel Xeon Gold 5118 and AMD EPYC 7H12, except for Google12, PTB character-level and PTB word-level, which were run on a GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions "language modelling code: https://github.com/deepmind/lamb" but does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | Details of the training procedure for each experiment can be found in SM A. The hyperparameters are selected based on a random search algorithm, where we present the rounded hyperparameters for the best performing LEM model (based on a validation set) on each task in Table 8. |
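
The Dataset Splits row quotes a 70%, 15%, 15% train, validation and test ratio. As a rough illustration of what such a split involves, the sketch below shuffles example indices with a fixed seed and cuts them at those proportions. It is not taken from the paper or its repository: the function name `split_indices`, the seed, and the rounding rule are all assumptions made for this example.

```python
import random


def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    Hypothetical helper: the name, seed and rounding rule are illustrative
    assumptions, not the procedure used by the paper's authors.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    # The test set takes whatever remains, so no example is dropped.
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

Fixing the seed matters for reproducibility: two runs that shuffle differently evaluate on different test examples, so reported numbers are only comparable when the split itself is pinned down or published.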