Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Uncovering the Spectral Bias in Diagonal State Space Models

Authors: Ruben Solozabal, Velibor Bojkovic, Hilal AlQuabeh, Kentaro Inui, Martin Takac

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluation proceeds as follows. First, we introduce a motivating example in the Continuous Copying task. Then, we utilize sCIFAR to probe the inductive biases that SSMs exhibit when learning on serialized image data. Finally, we demonstrate the benefits of our S4D-DFouT initialization across the Long Range Arena benchmark [17], and further ablation datasets as the Speech Commands dataset [29]. Details of the experimental settings are provided in the Appendix C.
Researcher Affiliation | Academia | Ruben Solozabal (MBZUAI); Velibor Bojkovic (MBZUAI); Hilal AlQuabeh (MBZUAI, RIKEN AIP); Kentaro Inui (MBZUAI, RIKEN AIP); Martin Takáč (MBZUAI)
Pseudocode | No | The paper includes mathematical formulations and proofs (e.g., Proposition 1 and its proof) but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The codebase upon which the experimental part was built is also publicly available, while the custom part of our code is added in the supplementary material.
Open Datasets | Yes | Our experimental evaluation proceeds as follows... Long Range Arena benchmark [17], and further ablation datasets as the Speech Commands dataset [29]. ... The serialized CIFAR-10 (sCIFAR) dataset... The BIDMC dataset [32] consists of continuous physiological signals...
Dataset Splits | No | The paper mentions using well-known benchmarks and datasets like LRA, sCIFAR, Speech Commands, and BIDMC. It describes data preprocessing steps such as padding sequences to maximum lengths (e.g., ListOps to 2048, Text to 4096) and standardization, but does not explicitly state the specific training, validation, and test splits (e.g., percentages or sample counts) used for these datasets within the paper.
Hardware Specification | No | We provide general framework (GPUs used in the experiments) in Appendix C, but we do not report the running time and memory consumption during training.
Software Dependencies | No | The codebase upon which the experimental part was built is also publicly available, while the custom part of our code is added in the supplementary material. Original S4 sourcecode from https://github.com/state-spaces/s4 is under Apache-2.0 license.
Experiment Setup | Yes | The S4D-DFouT hyperparameter configuration we adopt in the experimentation is provided in Table 5. Table 5: Hyperparameters used for the S4D-DFouT reported results. L denotes the number of layers; H, the embedding size; N, the hidden dimension; Dropout, the dropout rate; Lr, the global learning rate; Bs, the batch size; Epochs, the maximum number of training epochs; WD, weight decay; and (ξmin, ξmax), the range of decay rate values.
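
The preprocessing mentioned in the Dataset Splits row (padding sequences to a fixed maximum length, and standardization) could look roughly like the sketch below. This is an illustration only, not the authors' pipeline: the function names, the dictionary keys, right-padding with a pad value of 0, truncation of over-long inputs, and the small epsilon are all assumptions. Only the two maximum lengths (2048 and 4096) come from the table.

```python
from statistics import fmean, pstdev

# Maximum lengths quoted in the Dataset Splits row; the key names are hypothetical.
MAX_LEN = {"listops": 2048, "text": 4096}


def pad_to_length(tokens, max_len, pad_value=0):
    """Right-pad a sequence to max_len (assumed: pad value 0, truncate if longer)."""
    clipped = list(tokens[:max_len])
    return clipped + [pad_value] * (max_len - len(clipped))


def standardize(values, eps=1e-8):
    """Rescale a numeric sequence to zero mean and (approximately) unit variance."""
    mean = fmean(values)
    std = pstdev(values, mu=mean)
    # eps (an assumption) guards against division by zero for constant inputs.
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical inputs, used only to show the calling convention.
padded = pad_to_length([5, 3, 9], MAX_LEN["listops"])
signal = standardize([1.0, 2.0, 3.0, 4.0])
```

Padding gives every example in a task the same length so that sequences can be batched; whether the original code pads on the left or right, or truncates at all, is not stated in the summary above.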