Simplified State Space Layers for Sequence Modeling

Authors: Jimmy T.H. Smith, Andrew Warrington, Scott Linderman

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We now compare empirically the performance of the S5 layer to the S4 layer and other baseline methods. |
| Researcher Affiliation | Academia | (1) Institute for Computational and Mathematical Engineering, Stanford University; (2) Wu Tsai Neurosciences Institute, Stanford University; (3) Department of Statistics, Stanford University. |
| Pseudocode | Yes | Listing 1: JAX implementation to apply a single S5 layer to a batch of input sequences. (A hedged sketch of this mechanism follows the table.) |
| Open Source Code | Yes | The full S5 implementation is available at: https://github.com/lindermanlab/S5. |
| Open Datasets | Yes | The long range arena (LRA) benchmark (Tay et al., 2021) is a suite of six sequence modeling tasks... |
| Dataset Splits | Yes | There are 96,000 training sequences, 2,000 validation sequences, and 2,000 test sequences. |
| Hardware Specification | Yes | All comparisons were made using a 16GB NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions using JAX but does not provide specific version numbers for JAX or other software libraries. |
| Experiment Setup | Yes | Table 11 presents the main hyperparameters used for each experiment. Depth: number of layers. H: number of input/output features. P: Latent size. J: number of blocks used for the initialization of A (see Section B.1.1). Dropout: dropout rate. LR: global learning rate. SSM LR: the SSM learning rate. B: batch size. Epochs: max epochs set for the run. WD: weight decay. (An illustrative configuration record follows the table.) |
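The Pseudocode row points to Listing 1 of the paper, a JAX implementation that applies a single S5 layer to a batch of input sequences. As a rough, simplified sketch of the underlying mechanism (not a reproduction of Listing 1), the snippet below runs a discretized diagonal SSM recurrence with jax.lax.associative_scan and batches it with jax.vmap. The names Lambda_bar, B_bar, C, D, apply_ssm, and binary_operator are illustrative assumptions, and details such as the paper's conjugate-symmetric parameterization and the discretization step are omitted.

```python
# Minimal sketch of a diagonal SSM scan in JAX; assumes Lambda_bar (P,),
# B_bar (P, H), C (H, P), D (H,) are already discretized/initialized.
import jax
import jax.numpy as jnp

def binary_operator(elem_i, elem_j):
    # Associative operator for the linear recurrence x_k = a_k * x_{k-1} + b_k.
    a_i, b_i = elem_i
    a_j, b_j = elem_j
    return a_j * a_i, a_j * b_i + b_j

def apply_ssm(Lambda_bar, B_bar, C, D, u):
    # Apply the scan to a single sequence u of shape (L, H).
    Bu = jax.vmap(lambda u_k: B_bar @ u_k)(u).astype(Lambda_bar.dtype)  # (L, P)
    Lambda_elems = jnp.broadcast_to(Lambda_bar, Bu.shape)               # (L, P)
    # Parallel prefix scan over the sequence dimension.
    _, xs = jax.lax.associative_scan(binary_operator, (Lambda_elems, Bu))
    # Project states to outputs; take the real part since Lambda_bar may be complex.
    ys = jax.vmap(lambda x_k, u_k: (C @ x_k).real + D * u_k)(xs, u)
    return ys

# Batched over the leading axis of u, as in "a batch of input sequences".
batched_ssm = jax.vmap(apply_ssm, in_axes=(None, None, None, None, 0))
```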
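The Experiment Setup row lists the hyperparameter fields reported in Table 11. Purely as an illustration, one way to group those fields in code is a small configuration record; the class name S5Config and every value below are placeholders, not settings taken from Table 11.

```python
from dataclasses import dataclass

@dataclass
class S5Config:
    depth: int           # number of layers
    H: int               # number of input/output features
    P: int               # latent (state) size
    J: int               # number of blocks used for the initialization of A
    dropout: float       # dropout rate
    lr: float            # global learning rate
    ssm_lr: float        # separate learning rate for the SSM parameters
    batch_size: int      # B
    epochs: int          # max epochs set for the run
    weight_decay: float  # WD

# Placeholder values for illustration only; see Table 11 for the actual settings.
config = S5Config(depth=6, H=128, P=64, J=4, dropout=0.1,
                  lr=1e-3, ssm_lr=1e-3, batch_size=64, epochs=100,
                  weight_decay=0.05)
```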