Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

SaFARi: State-Space Models for Frame-Agnostic Representation

Authors: Hossein Babaei, Mel White, Sina Alemohammad, Richard Baraniuk

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that SaFARi can generate SSMs for function approximation over any frame or basis by choosing examples that are non-orthogonal, incomplete, or redundant. We then evaluate SaFARi-generated state-space models on some sample datasets online function approximation, benchmarking against established baselines.
Researcher Affiliation	Academia	Hossein Babaei, Department of Electrical and Computer Engineering, Rice University; Mel White, Department of Electrical and Computer Engineering, Rice University; Sina Alemohammad, Department of Electrical and Computer Engineering, Rice University; Richard G. Baraniuk, Department of Electrical and Computer Engineering, Rice University
Pseudocode	No	The paper includes mathematical formulations and derivations but does not contain any clearly labeled pseudocode or algorithm blocks with structured, code-like steps.
Open Source Code	Yes	Code to replicate the results of this section, as well as generate SSMs with arbitrary frames is provided at: https://github.com/echbaba/safari-ssm.
Open Datasets	Yes	S&P 500: We use the daily S&P 500 index as a broad, large-cap U.S. equities benchmark over the last decade: from August 2015 to August 2025 (Yahoo Finance (2025)). The series consists of end-of-day levels for the price index.
Dataset Splits	No	The paper describes data preparation strategies like collecting 'overlapping sequences of 500 samples' and 'resampled into 4,000 samples' for the S&P 500 dataset, and setting a 'window size...at 10% of the input signal length' for the translated case. However, it does not explicitly provide traditional training, validation, and test dataset splits with specific percentages or counts.
Hardware Specification	No	The paper mentions general categories of hardware like 'parallel hardware such as GPUs' in discussions about computational complexity, but it does not specify any exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies	No	The paper mentions 'generalized bilinear transform (GBT)' and 'Adam optimizer' but does not provide specific version numbers for these or any other software components used in the experiments.
Experiment Setup	Yes	Both scaled and translated versions were evaluated with N = 32, 64, 128, where N is the size of the signal representation. For the translated case, the window size is set at 10% of the input signal length. The LSTM and GRU models are trained using an Adam optimizer (Kingma (2014)) until they converge, and the final validation performance is shown in Fig. 9.
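
The data preparation and setup quoted in the Dataset Splits and Experiment Setup rows can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code (see the repository linked under Open Source Code for that). The function names, the stride of 250, the use of linear interpolation, the order of windowing before resampling, and the synthetic random-walk series are all assumptions; only the numbers 500, 4,000, 10%, and N = 32, 64, 128 come from the text above.

```python
import random

def overlapping_sequences(series, length=500, stride=250):
    """Cut a series into overlapping sequences of `length` samples.
    The stride is a hypothetical value; the source does not state the overlap."""
    return [series[s:s + length]
            for s in range(0, len(series) - length + 1, stride)]

def resample(seq, num=4000):
    """Linearly interpolate `seq` onto `num` evenly spaced points.
    Linear interpolation is an assumption, not the paper's stated method."""
    n = len(seq)
    out = []
    for i in range(num):
        pos = i * (n - 1) / (num - 1)   # fractional index into seq
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out

# Synthetic random walk standing in for a price series (not S&P 500 data).
rng = random.Random(0)
series, level = [], 0.0
for _ in range(2500):
    level += rng.gauss(0.0, 1.0)
    series.append(level)

sequences = overlapping_sequences(series)
resampled = [resample(s) for s in sequences]

# Translated case: window size is 10% of the input signal length.
window = len(resampled[0]) // 10

# Representation sizes N reported in the Experiment Setup row.
for n_coeffs in (32, 64, 128):
    print(f"N={n_coeffs}: {len(resampled)} sequences of "
          f"{len(resampled[0])} samples, window={window}")
```

With the assumed stride, a 2,500-sample series yields nine sequences, each stretched from 500 to 4,000 samples, and the translated-case window covers 400 samples.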