Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

FlowNet: Modeling Dynamic Spatio-Temporal Systems via Flow Propagation

Authors: Yutong Feng, Xu Liu, Yutong Xia, Yuxuan Liang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate that FlowNet significantly outperforms existing state-of-the-art approaches on seven metrics in the modeling of three real-world systems, validating its efficiency and physical interpretability.
Researcher Affiliation	Academia	Yutong Feng1, Xu Liu2, Yutong Xia2, Yuxuan Liang1 1The Hong Kong University of Science and Technology (Guangzhou) 2National University of Singapore
Pseudocode	No	The paper describes the methodology in prose and presents architectural diagrams (e.g., Figure 2), but it does not include a formal pseudocode block or algorithm listing.
Open Source Code	Yes	We provide links to our anonymised code in the Appendix.
Open Datasets	Yes	We evaluate our approach on three datasets (PEMS04F [36], Deep Base [39], SINPA [40]) of dynamic spatio-temporal systems, spanning transportation, hydrology, and urban mobility domains. ... All the datasets we use are open source.
Dataset Splits	Yes	The datasets are partitioned as follows: PEMS04F (70% train, 10% validation, 20% test), Deep Base (pre-2021 train, 2021-2022 validation/test), and SINPA (pre-May-2021 train, May-June 2021 validation/test).
Hardware Specification	Yes	We conduct all experiments on one NVIDIA A100 80GB GPU.
Software Dependencies	No	The paper mentions using the Adam optimizer and that baseline code implementations are based on 'time-series-library [59] and LargeST [60]', but specific version numbers for these or other critical software components (e.g., Python, PyTorch, CUDA) are not provided.
Experiment Setup	Yes	We conduct all experiments on one NVIDIA A100 80GB GPU. The Adam optimizer is utilized to train our model, and the batch size is 8. The learning rate starts from 1 × 10⁻³, halved every 20 epochs until the 60th epoch, and we start early stopping at the 20th epoch of training. For FlowNet, we stack 2 layers of FAM and set up 16 experts inside the M-MLP. The Flow token uses 4 heads, and all hidden layer dimensions are set to 64. ... For the linear layers in M-MLP, we employed Kaiming initialization to set their initial parameters. All other linear layers in the model were initialized using Xavier initialization. GELU is employed as the activation function in M-MLP. ... During training, we employ MAE as the loss function for FlowNet and all baselines. Our early stopping protocol monitors the MAE metric on the validation set and terminates training when no improvement is observed for 10 consecutive epochs relative to the best recorded value, indicating potential overfitting.
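
The learning-rate schedule and early-stopping rule quoted in the Experiment Setup row can be read in more than one way, so the following is only a minimal sketch of one plausible interpretation, not the authors' implementation. It assumes 0-indexed epochs, halvings at epochs 20, 40, and 60 with a constant rate afterwards, a best validation MAE tracked from the first epoch, and stopping that is only allowed from epoch 20 onward. The names `learning_rate` and `EarlyStopper` are hypothetical and do not come from the paper or its code.

```python
def learning_rate(epoch, base_lr=1e-3, step=20, last_decay_epoch=60):
    """Return the learning rate for a 0-indexed epoch.

    Assumption: the rate is halved at epochs 20, 40 and 60 and then
    held constant, which is one reading of "halved every 20 epochs
    until the 60th epoch".
    """
    num_halvings = min(epoch, last_decay_epoch) // step
    return base_lr * (0.5 ** num_halvings)


class EarlyStopper:
    """Patience-based early stopping on validation MAE (lower is better).

    Assumption: the best value is tracked from epoch 0, but a stop
    signal is only emitted once `start_epoch` has been reached.
    """

    def __init__(self, patience=10, start_epoch=20):
        self.patience = patience
        self.start_epoch = start_epoch
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, epoch, val_mae):
        """Record one validation result; return True if training should stop."""
        if val_mae < self.best:
            # New best value: reset the count of epochs without improvement.
            self.best = val_mae
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return epoch >= self.start_epoch and self.bad_epochs >= self.patience
```

In a training loop, `learning_rate(epoch)` would be assigned to the optimizer at the start of each epoch and `EarlyStopper.step(epoch, val_mae)` called after each validation pass. If the model is trained with PyTorch, which the paper does not confirm, `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[20, 40, 60]` and `gamma=0.5` would give the same schedule under these assumptions.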