Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Scaling Epidemic Inference on Contact Networks: Theory and Algorithms
Authors: Guanghui Min, Yinhan He, Chen Chen
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct extensive experiments on six real-world datasets, demonstrating our method's effectiveness and robustness in estimating the nodes' final state distribution. Specifically, our proposed method consistently produces accurate estimates aligned with results from a large number of MC simulations, while maintaining a runtime comparable to a single MC simulation. |
| Researcher Affiliation | Academia | Guanghui Min, Yinhan He, Chen Chen; University of Virginia; EMAIL |
| Pseudocode | Yes | Algorithm 1 Sketch of PID; Algorithm 2 Sketch of RAPID; Algorithm 1: Monte Carlo Simulation for SIR; Algorithm 2: PID: Probabilistic Infection Dynamics; Algorithm 3: RAPID: Residual-Accelerated Propagation for Infection Dynamics |
| Open Source Code | Yes | Our code and datasets are available at https://github.com/GuanghuiMin/RAPID. |
| Open Datasets | Yes | Our code and datasets are available at https://github.com/GuanghuiMin/RAPID. We use graphs from diverse domains, including a real-world hospital contact network (carilion-Hospital [1]), a real-world HIV transmission network (hiv-Trans [40]), communication networks (email-Enron [35, 29], email-EuAll [34]), and social networks (soc-Epinions [52], soc-Pokec [54]). |
| Dataset Splits | No | The paper does not explicitly provide traditional training/test/validation dataset splits. Instead, it describes estimating ground truth infection probabilities by running a large number of Monte Carlo simulations (e.g., 50-run or 1000-run MC simulations) over entire datasets, rather than splitting the datasets themselves for distinct training and evaluation phases of a model. |
| Hardware Specification | Yes | All experiments were conducted on a machine equipped with an Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz processor with 376 GB of memory. |
| Software Dependencies | No | All algorithms are implemented in Python with the NetworkX library. The paper gives no version numbers for Python or NetworkX, which are needed to reproduce the software environment. |
| Experiment Setup | Yes | For RAPID, the number of preheat steps p is set to 20, and the propagation residual threshold is set to 10⁻³. ... For reporting and baseline comparisons, we adopt the parameter setting of Nipah Virus (β = 1/18, γ = 1/9) with an initial infection fraction of α = 0.01. |
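
The Monte Carlo baseline referenced in the table (repeated SIR simulations whose per-node outcomes are averaged) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: it assumes a discrete-time SIR process in which each infected node infects each susceptible neighbor with probability β per step and recovers with probability γ per step, and it runs until no infected nodes remain. The function names `simulate_sir` and `estimate_recovered_probability`, the toy graph, and the use of a plain adjacency dictionary instead of a NetworkX graph are all hypothetical choices made for this example.

```python
import random


def simulate_sir(adj, beta, gamma, alpha, rng):
    """Run one discrete-time SIR simulation (assumed dynamics, see lead-in).

    adj   : dict mapping each node to an iterable of its neighbors
    beta  : per-step, per-edge infection probability
    gamma : per-step recovery probability (must be > 0 so the run ends)
    alpha : fraction of nodes infected at time 0 (at least one seed is used)
    Returns the set of nodes that end in the recovered state.
    """
    if not 0 < gamma <= 1:
        raise ValueError("gamma must be in (0, 1]")
    nodes = list(adj)
    n_seeds = max(1, round(alpha * len(nodes)))
    infected = set(rng.sample(nodes, n_seeds))
    recovered = set()
    while infected:
        # Infection attempts along every edge leaving an infected node.
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        # Recovery is drawn only for nodes infected before this step.
        newly_recovered = {u for u in infected if rng.random() < gamma}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return recovered


def estimate_recovered_probability(adj, beta, gamma, alpha, runs, seed=0):
    """Average many runs to estimate P(node ends recovered) for each node."""
    rng = random.Random(seed)
    counts = dict.fromkeys(adj, 0)
    for _ in range(runs):
        for u in simulate_sir(adj, beta, gamma, alpha, rng):
            counts[u] += 1
    return {u: c / runs for u, c in counts.items()}


# Toy contact network (hypothetical): a path 0-1-2-3 plus an isolated node 4.
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}

# Parameter setting quoted in the table: beta = 1/18, gamma = 1/9, alpha = 0.01.
probs = estimate_recovered_probability(
    toy_graph, beta=1 / 18, gamma=1 / 9, alpha=0.01, runs=200
)
```

A NetworkX graph `G` could be passed in as `{u: list(G[u]) for u in G}`. The cost of this baseline grows linearly with the number of runs, which is the overhead the paper's method is reported to avoid by matching the estimates of many MC runs at roughly the runtime of one.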