Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Inferring stochastic dynamics with growth from cross-sectional data

Authors: Stephen Zhang, Suryanarayana Maddu, Xiaojie Qiu, Victor Chardès

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We showcase the applicability of our approach through evaluation on a range of simulated and real single-cell RNA-seq datasets. Comparing to several existing methods, we find our method achieves higher accuracy while enjoying a simple two-step training scheme.
Researcher Affiliation | Academia | Stephen Zhang: School of Mathematics and Statistics, University of Melbourne; Suryanarayana Maddu: Center for Computational Biology, Flatiron Institute; Xiaojie Qiu: Department of Genetics, Stanford University School of Medicine; Victor Chardès: Center for Computational Biology, Flatiron Institute
Pseudocode | Yes | We summarise the UPFI algorithm in Alg. 1.
Open Source Code | Yes | Code is available at https://github.com/zsteve/UPFI.
Open Datasets | Yes | Data for the study of [1] are available from the original publication using the GEO database with accession number GSE140802. The lineage tracing data used in the experiments is publicly accessible on the GEO database with the query code GSE140802.
Dataset Splits | No | The paper describes simulating data and using existing single-cell RNA-seq datasets (e.g., from GEO database GSE140802). For these datasets, it specifies selections of cells (e.g., 86,416 cells contributing to a trajectory) or uses PCA embeddings, but it does not describe explicit training, validation, and test splits for machine learning model evaluation, nor does it refer to standard predefined splits for the experimental tasks.
Hardware Specification | Yes | All model training was carried out using a NVIDIA L40S GPU.
Software Dependencies | No | The paper mentions using "PyTorch" and "GeomLoss package" without specific version numbers. It also refers to "dyn.pp.recipe_monocle function from the Dynamo package [43]" and "Scanpy: large-scale single-cell gene expression data analysis. Genome biology 19, 1–5 (2018)", again without specific versions for the packages used. Therefore, it does not provide specific ancillary software versions.
Experiment Setup | Yes | Our architecture and hyperparameter choices are listed in Table 6. Table 5: Hyperparameter settings: score networks. Table 6: Hyperparameter settings: dynamics.