Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

Authors: Francesco Cozzi, Marco Pangallo, Alan Perotti, André Panisson, Corrado Monti

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our approach on two ABMs (Schelling's segregation model and a Predator-Prey ecosystem) showing that it replicates individual-level patterns and accurately forecasts emergent dynamics beyond training. Our results demonstrate the potential of combining diffusion models and graph learning for data-driven ABM simulation. In this section, we present different experiments to assess and demonstrate our framework's ability to learn micro-level agent behaviors and faithfully reproduce emergent system-level dynamics.
Researcher Affiliation	Collaboration	Francesco Cozzi (Sapienza University, Rome, Italy; CENTAI, Turin, Italy); Marco Pangallo (CENTAI, Turin, Italy); Alan Perotti (CENTAI, Turin, Italy); André Panisson (CENTAI, Turin, Italy); Corrado Monti (CENTAI, Turin, Italy)
Pseudocode	Yes	Algorithm 1: Training Procedure; Algorithm 1: Generation; Algorithm 2: Schelling Model of Segregation; Algorithm 3: Predator-Prey model
Open Source Code	Yes	All implementation and reproducibility details are provided in the Supplementary Materials. Full code to reproduce our experiments is available at http://github.com/fracozzi/ABM-Graph-Diffusion-Network.
Open Datasets	No	Since our method is trained on data traces, it can seamlessly integrate empirical observations alongside simulated data, thus being potentially applicable to real-world scenarios. In this sense, our work represents a first step toward developing a comprehensive methodology for creating easy-to-use, learnable ABMs. The paper uses simulation data generated from canonical ABMs (Schelling's segregation model and a Predator-Prey ecosystem), rather than publicly available external datasets.
Dataset Splits	Yes	For each ABM and parameter setting, we simulate T_train = 10 main-branch steps with R = 500 stochastic branches per step, yielding the training ramification as in Figure 1. For macro-evaluation, we run 100 independent main-branch simulations to calculate sMAPE. For micro-evaluation, we generate an out-of-sample ramification dataset of T = 25 timesteps.
Hardware Specification	Yes	All experiments were run in a cloud-based server with 15 vCores, 180 GB of RAM, and an NVIDIA A100 80GB PCIe GPU.
Software Dependencies	No	The paper mentions optimizers (Adam) and activation functions (Leaky ReLU) with some parameters but does not specify the versions of major software libraries or frameworks (e.g., PyTorch, TensorFlow) used, which is necessary for a reproducible description of ancillary software.
Experiment Setup	Yes	We train both surrogate and ablations for 100 epochs using Adam with learning rate 10^-5 for the diffusion model and Adam with learning rate 2 × 10^-5 for the GNN, batch size equal to number of agents, and diffusion hyper-parameters τ_max = 100 (more information in Supplementary Section A).
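
For readers who want the reported settings in one place, the sketch below collects the values quoted in the Dataset Splits and Experiment Setup rows into a small Python record. It is not taken from the authors' repository: the class name, field names, and the helper method are hypothetical, and the learning rates are read as 1e-5 and 2e-5, which is an interpretation of the extracted text. The batch size is omitted because the table gives it only as "number of agents".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Settings quoted in the table above; all names here are illustrative."""
    epochs: int = 100
    lr_diffusion: float = 1e-5      # Adam, diffusion model (interpreted value)
    lr_gnn: float = 2e-5            # Adam, GNN (interpreted value)
    tau_max: int = 100              # diffusion hyper-parameter
    t_train: int = 10               # main-branch training steps
    branches_per_step: int = 500    # R stochastic branches per step
    macro_eval_runs: int = 100      # independent main-branch simulations
    micro_eval_steps: int = 25      # out-of-sample ramification length T

    def training_branches(self) -> int:
        # Total stochastic branches in the training ramification, assuming
        # every main-branch step contributes exactly R branches.
        return self.t_train * self.branches_per_step


setup = ReportedSetup()
print(setup.training_branches())
```

Under the stated assumption, the training ramification for one ABM and parameter setting would contain 10 × 500 branches; the paper's Supplementary Section A remains the authoritative source for the full configuration.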