Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

SING: SDE Inference via Natural Gradients

Authors: Amber Hu, Henry Smith, Scott Linderman

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	SING outperforms prior methods in state inference and drift estimation on a variety of datasets, including a challenging application to modeling neural dynamics in freely behaving animals. Altogether, our results illustrate the potential of SING as a tool for accurate inference in complex dynamical systems, especially those characterized by limited prior knowledge and non-conjugate structure. 5 Experiments 5.1 Inference on synthetic data 5.2 Drift estimation on synthetic data 5.3 Runtime comparisons 5.4 Application to modeling neural dynamics during aggression
Researcher Affiliation	Academia	Amber Hu Stanford University EMAIL Henry D. Smith Stanford University EMAIL Scott W. Linderman Stanford University EMAIL
Pseudocode	No	The paper describes steps for algorithms in prose and bullet points within the text (e.g., in Appendix G.1 "Generalizing these recursions leads to the following sequential algorithm:"), but it does not contain a clearly labeled algorithm block, figure, or section explicitly titled "Pseudocode" or "Algorithm X" as a distinct element.
Open Source Code	Yes	Code: https://github.com/lindermanlab/sing
Open Datasets	Yes	We apply our method to a publicly available dataset from Vinograd et al. [55], which can be found at: https://dandiarchive.org/dandiset/001037. We use the dataset labeled as sub-M0L which consists of neural activity from one trial of a mouse exhibiting aggressive behavior towards two consecutive intruders.
Dataset Splits	No	For synthetic experiments, data is simulated (e.g., "We generate 30 independent trials from an SDE..." in Appendix L.1.1), rather than splitting a fixed dataset. For the real-world application, the paper states, "We use the dataset labeled as sub-M0L which consists of neural activity from one trial..." (Appendix L.5), indicating a single trial is used for analysis without explicit training/validation/test splits as typically defined for reproducing data partitioning.
Hardware Specification	Yes	For all experiments, we fit our models using either an NVIDIA A100 GPU or NVIDIA H100 GPU. All runtime comparisons were performed on an NVIDIA A100 GPU.
Software Dependencies	No	The paper mentions using "Dynamax package [54]" and "Adam [52]" but does not specify version numbers for these software components or any other libraries. Reference [54] indicates "JAX" but also without a version.
Experiment Setup	Yes	We fit SING for 500 iterations. We use the following step size schedule inspired by [25]: ρ is log-linearly increased from 10^-3 to 10^-1.5 for the first 10 iterations, and then kept at 10^-1.5 for the rest of the iterations. For Adam-based optimization, we sweep over initial learning rates {10^-4, 5 × 10^-4, 10^-3, 5 × 10^-3, 10^-2}, fit for 20 iterations, and choose the best fit according to the final ELBO value, which in this case corresponded to learning rate 5 × 10^-3. Our synthetic data example consists of a 2D latent nonlinear dynamical system evolving according to the Duffing equation... with parameters (α, β, γ) = (2, 1, 0.1) and initial condition x(0) = (√α/β − 0.1, 0.1). For the neural-SDE, we use a neural network with two hidden layers of size 64 and ReLU activations.
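
The step size schedule quoted in the Experiment Setup row can be sketched in a few lines of plain Python. This is an illustrative reading, not code from the SING repository: the function name and arguments are hypothetical, the exponents are assumed to be negative (10^-3 and 10^-1.5, since the extracted text appears to have lost the minus signs), and the warmup is assumed to reach the final value exactly on the 10th iteration.

```python
def step_size_schedule(n_iters=500, warmup=10, log10_start=-3.0, log10_end=-1.5):
    """Return one step size rho per iteration (hypothetical helper).

    Assumption: rho is interpolated linearly in log10 space over the first
    `warmup` iterations, then held constant at 10**log10_end.
    """
    rhos = []
    for i in range(n_iters):
        if i < warmup:
            # Fraction of the warmup completed; 0 at i=0, 1 at i=warmup-1.
            frac = i / (warmup - 1) if warmup > 1 else 1.0
            log10_rho = log10_start + frac * (log10_end - log10_start)
        else:
            # After warmup, keep the final step size.
            log10_rho = log10_end
        rhos.append(10.0 ** log10_rho)
    return rhos


rhos = step_size_schedule()
```

Under these assumptions, `rhos[0]` is about 0.001 and every entry from index 9 onward is about 0.0316. The paper's actual implementation may define the warmup endpoints differently.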