Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

ZeroS: Zero‑Sum Linear Attention for Efficient Transformers

Authors: Jiecheng Lu, Xu Han, Yan Sun, Viresh Pati, Yubin Kim, Siddhartha Somani, Shihao Yang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate both linear-time ZeroS and quadratic-time ZeroS-SM on recent in-context learning benchmarks, along with experiments on NLP, image, and time series tasks. In all experiments, we directly replaced the multi-head attention module with ZeroS under original benchmark settings, preserving all other components (MLP/GLU, embeddings, hyperparameters) to ensure strict alignment with previous standards. (Section 4, Experiments)
Researcher Affiliation	Collaboration	Jiecheng Lu¹, Xu Han², Yan Sun¹, Viresh Pati¹, Yubin Kim¹, Siddhartha Somani¹, Shihao Yang¹. ¹Georgia Institute of Technology, ²Amazon Web Services.
Pseudocode	No	The paper describes the methodology and architecture through text and figures (e.g., Figure 1 illustrating the ZeroS block and its components), and provides mathematical formulations for its components. However, it does not include a distinct block labeled 'Pseudocode' or 'Algorithm' with structured steps for the overall method.
Open Source Code	Yes	The code implementation is available at this link. (Abstract)
Open Datasets	Yes	We evaluate ZeroS on the MAD benchmark [54], RegBench [57], WikiText-103 [60], OpenWebText2 [61], Weather [69], Solar [70], and ETT [71]. All of these are well-known and cited public datasets.
Dataset Splits	Yes	We conduct language modeling on WikiText-103 following [60]'s setup, with results in Table 2. We follow the setup of [56] for the MQAR task, which evaluates models' ability to learn induction heads for in-context associative recall. We evaluate ZeroS on RegBench [57] following the original experimental setup (Figure 2). Following the setup in [65], we evaluate ZeroS on time series forecasting tasks.
Hardware Specification	Yes	Yes, all tasks used in this paper can be trained on the single Nvidia RTX 4090 GPU that we used. (NeurIPS Paper Checklist Q8)
Software Dependencies	No	The paper mentions using a 'code environment provided by nanoGPT' for the OpenWebText2 dataset training (Appendix A.5.5), but does not provide specific version numbers for software libraries, programming languages, or other key dependencies.
Experiment Setup	Yes	In all experiments, we directly replaced the multi-head attention module with ZeroS under original benchmark settings, preserving all other components (MLP/GLU, embeddings, hyperparameters) to ensure strict alignment with previous standards. (Section 4)