Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Cross City Traffic Flow Generation via Retrieval Augmented Diffusion Model

Authors: Yudong Li, Jingyuan Wang, Xie Yu, Peiyu Wang, Qian Huang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on four real-world datasets demonstrate that, compared to existing generation methods, our method achieves best cross-city zero-shot performance. Our code and datasets can be found in https://github.com/lyd1881310/CRAFT. Extensive experiments on four real-world urban datasets demonstrate the state-of-the-art (SOTA) zero-shot generation performance of our model and further validate its strong generalization ability.
Researcher Affiliation	Collaboration	Yudong Li [1], Jingyuan Wang* [1, 2, 3], Xie Yu [1], Peiyu Wang [1], Qian Huang [4]. [1] School of Computer Science and Engineering, Beihang University, Beijing, China; [2] School of Economics and Management, Beihang University, Beijing, China; [3] MIIT Key Laboratory of Data Intelligence and Management, Beihang University, Beijing, China; [4] Global Technical Service Dept, Huawei Technologies Co., Ltd, Beijing, China.
Pseudocode	No	The paper describes the methodology in Section 4 'Methodology' using descriptive text and mathematical equations for 'Geographic Feature Alignment', 'Retrieval-based Condition Augmentation', and 'Conditional Diffusion Backbone'. However, it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Our code and datasets can be found in https://github.com/lyd1881310/CRAFT.
Open Datasets	Yes	Dataset: We conducted experiments on four real-world bicycle trip datasets, namely Chicago (CHI) [1], Washington, D.C. (DC) [2], Toronto (TRT) [3], and New York City (NYC) [4]. ... [1] https://divvy-tripdata.s3.amazonaws.com [2] https://s3.amazonaws.com/capitalbikeshare-data [3] https://ckan0.cf.opendata.inter.prod-toronto.ca [4] https://s3.amazonaws.com/tripdata/2023-citibike-tripdata.zip
Dataset Splits	No	For the processed data, we have set a sliding window with a length of T to extract the training and test samples. If there are still missing values exceeding 5% in a certain sample, we will discard that sample. The paper mentions train/test samples and a sliding window for data extraction but does not specify explicit percentages or absolute counts for the overall dataset splits (e.g., 80/10/10 split).
Hardware Specification	Yes	All neural network models (including CRAFT and other baselines) are implemented in PyTorch and trained on a single NVIDIA RTX 3090 GPU. The experimental machine ran on Ubuntu 20.04.6 LTS, was equipped 24-core Intel(R) Xeon(R) Silver CPU, and had 503 GB of RAM.
Software Dependencies	No	All neural network models (including CRAFT and other baselines) are implemented in PyTorch and trained on a single NVIDIA RTX 3090 GPU. The experimental machine ran on Ubuntu 20.04.6 LTS, was equipped 24-core Intel(R) Xeon(R) Silver CPU, and had 503 GB of RAM. The paper mentions PyTorch and the Python Optimal Transport (POT) tool, and the operating system Ubuntu 20.04.6 LTS. However, specific version numbers for PyTorch and POT are not provided.
Experiment Setup	Yes	For the proposed CRAFT method, we provide the hyperparameter settings in Table 3 to facilitate the reproducibility by researchers. All these parameters are recommended values, not fixed, and can be adjusted according to the dataset and experimental environment. During training, the AdamW optimizer was used. To enhance stability, the EMA (Exponential Moving Average) mechanism was adopted to train the diffusion model.
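
The Dataset Splits entry quotes a sliding window of length T and a rule that discards samples with more than 5% missing values. A minimal sketch of that extraction step is given below for illustration. It is not taken from the CRAFT repository: the function name `extract_windows`, the `stride` parameter, and the use of NaN to mark missing values are all assumptions, and the train/test partition is omitted because the paper does not specify it.

```python
import math


def extract_windows(series, window_len, max_missing_frac=0.05, stride=1):
    """Slide a window of length `window_len` over `series`.

    Hypothetical helper: missing values are assumed to be encoded as NaN.
    A window is dropped when its share of missing values exceeds
    `max_missing_frac` (5% by default, following the quoted rule).
    """
    samples = []
    for start in range(0, len(series) - window_len + 1, stride):
        window = series[start:start + window_len]
        missing = sum(1 for value in window if math.isnan(value))
        if missing / window_len > max_missing_frac:
            continue  # too many gaps: discard this sample
        samples.append(window)
    return samples


# Toy flow series with one missing reading (illustrative values only).
nan = float("nan")
flow = [3.0, 4.0, nan, 5.0, 6.0, 7.0, 8.0, 9.0]
windows = extract_windows(flow, window_len=4)
```

With a window of 4, a single missing value already amounts to 25% of the sample, so only the windows that avoid the gap entirely are kept.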