Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Simplified State Space Layers for Sequence Modeling
Authors: Jimmy T.H. Smith, Andrew Warrington, Scott Linderman
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now compare empirically the performance of the S5 layer to the S4 layer and other baseline methods. |
| Researcher Affiliation | Academia | 1Institute for Computational and Mathematical Engineering, Stanford University. 2Wu Tsai Neurosciences Institute, Stanford University. 3Department of Statistics, Stanford University. |
| Pseudocode | Yes | Listing 1: JAX implementation to apply a single S5 layer to a batch of input sequences. |
| Open Source Code | Yes | The full S5 implementation is available at: https://github.com/lindermanlab/S5. |
| Open Datasets | Yes | The long range arena (LRA) benchmark (Tay et al., 2021) is a suite of six sequence modeling tasks... |
| Dataset Splits | Yes | There are 96,000 training sequences, 2,000 validation sequences, and 2,000 test sequences. |
| Hardware Specification | Yes | All comparisons were made using a 16GB NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions using JAX but does not provide specific version numbers for JAX or other software libraries. |
| Experiment Setup | Yes | Table 11 presents the main hyperparameters used for each experiment. Depth: number of layers. H: number of input/output features. P: Latent size. J: number of blocks used for the initialization of A (see Section B.1.1). Dropout: dropout rate. LR: global learning rate. SSM LR: the SSM learning rate. B: batch size. Epochs: max epochs set for the run. WD: weight decay. |
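
The hyperparameter legend quoted in the Experiment Setup row can be read as a configuration schema. The following is a minimal sketch of such a schema, not code from the paper or its repository: the class name `S5ExperimentConfig`, the field names, the validation rules, and the example values are all hypothetical. In particular, the check that P is divisible by J is an assumption about how the J blocks partition the latent state, and the values shown are placeholders rather than entries from Table 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S5ExperimentConfig:
    """Hypothetical container mirroring the hyperparameter legend of Table 11."""

    depth: int           # Depth: number of layers
    h: int               # H: number of input/output features
    p: int               # P: latent size
    j: int               # J: number of blocks used for the initialization of A
    dropout: float       # Dropout: dropout rate
    lr: float            # LR: global learning rate
    ssm_lr: float        # SSM LR: the SSM learning rate
    batch_size: int      # B: batch size
    epochs: int          # Epochs: max epochs set for the run
    weight_decay: float  # WD: weight decay

    def __post_init__(self):
        # All count-like fields must be positive integers.
        counts = (self.depth, self.h, self.p, self.j, self.batch_size, self.epochs)
        if min(counts) < 1:
            raise ValueError("depth, h, p, j, batch_size and epochs must be >= 1")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must lie in [0, 1)")
        # Assumption (not stated in this summary): the J blocks split the
        # latent state evenly, so P must be a multiple of J.
        if self.p % self.j != 0:
            raise ValueError("p must be divisible by j")


# Placeholder values for illustration only; NOT taken from Table 11.
example = S5ExperimentConfig(
    depth=4, h=128, p=64, j=2, dropout=0.1,
    lr=1e-3, ssm_lr=1e-3, batch_size=32, epochs=10, weight_decay=0.01,
)
```

Validating the fields at construction time means a mistyped hyperparameter fails immediately rather than partway through a training run.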