Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks

Authors: Luca Arnaboldi, Bruno Loureiro, Ludovic Stephan, Florent Krzakala, Lenka Zdeborová

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	All our formal claims are supported by rigorous proofs, as well as numerical experiments; the code developed is available at https://github.com/IdePHICS/Sequence-Single-Index.
Researcher Affiliation	Academia	Luca Arnaboldi (IdePHICS Laboratory, EPFL, Lausanne, Switzerland); Bruno Loureiro (Département d'Informatique, École Normale Supérieure, PSL, Paris, France); Ludovic Stephan (ENSAI, University Rennes, Rennes, France); Florent Krzakala (IdePHICS Laboratory, EPFL, Lausanne, Switzerland); Lenka Zdeborová (SPOC Laboratory, EPFL, Lausanne, Switzerland)
Pseudocode	No	The paper describes theoretical models, derivations, and mathematical equations. It does not present any pseudocode or algorithm blocks.
Open Source Code	Yes	All our formal claims are supported by rigorous proofs, as well as numerical experiments; the code developed is available at https://github.com/IdePHICS/Sequence-Single-Index.
Open Datasets	No	To derive a sharp characterization of the sample complexity and convergence rate of SGD for the single-layer attention mechanism in eq. (1), we assume that the sequence data (X, y) is generated from the following Gaussian sequence single-index (SSI) model
Dataset Splits	No	We assume training data $(X, y) \in \mathbb{R}^{L \times d} \times \mathbb{R}^{k}$ is independently drawn from a Gaussian Sequence Single Index (SSI) model
Hardware Specification	Yes	The experiments run on a Mac Studio M2 Ultra, within at most a few hours for the largest ones.
Software Dependencies	No	The code is written in Python, using the libraries numpy, scipy, torch and matplotlib. hydra is used to manage the configuration files.
Experiment Setup	Yes	The simulations in Figure 5 are performed with d = 1000... In Figure 12 we reproduce Figure 5 for d = 100... The probability is computed over 64 SGD runs... Averaged over 25 runs... The gain is proportional to L, as predicted. $g(z) = \sum_{i=1}^{L} \mathrm{He}_2(z_i)$, d = 1000, σ = ReLU. (Section E.1) $\gamma_{\mathrm{tied}} = \gamma_{\mathrm{untied}} = \text{constant with } L = \gamma_0$. ... $\gamma_{\mathrm{tied}} = \gamma_{\mathrm{untied}} = \gamma_0 = 0.005$. ... In our experiments, we used $N_{\mathrm{int}} = 17$, while for the phase diagram in Figure 4 we used $N_{\mathrm{int}} = 19$.
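
The synthetic-data setting quoted in the table (Gaussian sequence inputs with the link function g built from He_2) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that X has i.i.d. standard Gaussian entries, that a single hidden unit-norm direction w produces one projection per sequence position, and that the label is a scalar; the function names, the sequence length L = 4, and the seed are hypothetical choices made only for illustration. Only d = 1000 and the batch of 64 echo numbers from the table.

```python
import numpy as np


def he2(z):
    """Second probabilists' Hermite polynomial: He_2(z) = z^2 - 1."""
    return z**2 - 1.0


def sample_ssi(n, L, d, w, rng):
    """Draw n samples from an assumed Gaussian sequence single-index model.

    Assumptions (not taken from the paper): X has i.i.d. N(0, 1) entries,
    z = X @ w is the projection of each of the L tokens onto w, and the
    scalar label is y = sum_i He_2(z_i).
    """
    X = rng.standard_normal((n, L, d))  # n sequences of L tokens in R^d
    z = X @ w                           # shape (n, L)
    y = he2(z).sum(axis=1)              # shape (n,)
    return X, y


rng = np.random.default_rng(0)          # hypothetical seed
d, L = 1000, 4                          # d as in the table; L is illustrative
w = rng.standard_normal(d)
w /= np.linalg.norm(w)                  # unit norm, so each z_i ~ N(0, 1)
X, y = sample_ssi(64, L, d, w, rng)
```

With a unit-norm w, each projection is standard Gaussian, so every He_2 term has zero mean under this assumed model.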