Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Inferring stochastic dynamics with growth from cross-sectional data
Authors: Stephen Zhang, Suryanarayana Maddu, Xiaojie Qiu, Victor Chardès
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We showcase the applicability of our approach through evaluation on a range of simulated and real single-cell RNA-seq datasets. Comparing to several existing methods, we find our method achieves higher accuracy while enjoying a simple two-step training scheme. |
| Researcher Affiliation | Academia | Stephen Zhang: School of Mathematics and Statistics, University of Melbourne; Suryanarayana Maddu: Center for Computational Biology, Flatiron Institute; Xiaojie Qiu: Department of Genetics, Stanford University School of Medicine; Victor Chardès: Center for Computational Biology, Flatiron Institute |
| Pseudocode | Yes | We summarise the UPFI algorithm in Alg. 1. |
| Open Source Code | Yes | Code is available at https://github.com/zsteve/UPFI. |
| Open Datasets | Yes | Data for the study of [1] are available from the original publication using the GEO database with accession number GSE140802. The lineage tracing data used in the experiments is publicly accessible on the GEO database with the query code GSE140802. |
| Dataset Splits | No | The paper describes simulating data and using existing single-cell RNA-seq datasets (e.g., from GEO database GSE140802). For these datasets, it specifies selections of cells (e.g., 86,416 cells contributing to a trajectory) or uses PCA embeddings, but it does not describe explicit training, validation, and test splits for machine learning model evaluation, nor does it refer to standard predefined splits for the experimental tasks. |
| Hardware Specification | Yes | All model training was carried out using an NVIDIA L40S GPU. |
| Software Dependencies | No | The paper mentions using "PyTorch" and the "GeomLoss package" without specific version numbers. It also refers to the "dyn.pp.recipe_monocle function from the Dynamo package [43]" and "Scanpy: large-scale single-cell gene expression data analysis. Genome biology 19, 1 5 (2018)", again without specific versions for the packages used. Therefore, it does not provide specific ancillary software versions. |
| Experiment Setup | Yes | Our architecture and hyperparameter choices are listed in Table 6. Table 5: Hyperparameter settings: score networks. Table 6: Hyperparameter settings: dynamics. |