Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

Authors: Francesco Cozzi, Marco Pangallo, Alan Perotti, André Panisson, Corrado Monti

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our approach on two ABMs (Schelling's segregation model and a Predator-Prey ecosystem) showing that it replicates individual-level patterns and accurately forecasts emergent dynamics beyond training. Our results demonstrate the potential of combining diffusion models and graph learning for data-driven ABM simulation. In this section, we present different experiments to assess and demonstrate our framework's ability to learn micro-level agent behaviors and faithfully reproduce emergent system-level dynamics.
Researcher Affiliation | Collaboration | Francesco Cozzi: Sapienza University, Rome, Italy; CENTAI, Turin, Italy. Marco Pangallo: CENTAI, Turin, Italy. Alan Perotti: CENTAI, Turin, Italy. André Panisson: CENTAI, Turin, Italy. Corrado Monti: CENTAI, Turin, Italy.
Pseudocode | Yes | Algorithm 1: Training Procedure; Algorithm 1: Generation; Algorithm 2: Schelling Model of Segregation; Algorithm 3: Predator-Prey model
Open Source Code | Yes | All implementation and reproducibility details are provided in the Supplementary Materials. Full code to reproduce our experiments is available at http://github.com/fracozzi/ABM-Graph-Diffusion-Network.
Open Datasets | No | Our method is trained on data traces, it can seamlessly integrate empirical observations alongside simulated data, thus being potentially applicable to real-world scenarios. In this sense, our work represents a first step toward developing a comprehensive methodology for creating easy-to-use, learnable ABMs. The paper uses simulation data generated from canonical ABMs (Schelling's segregation model and a Predator-Prey ecosystem), rather than publicly available external datasets.
Dataset Splits | Yes | For each ABM and parameter setting, we simulate T_train = 10 main-branch steps with R = 500 stochastic branches per step, yielding the training ramification as in Figure 1. For macro-evaluation, we run 100 independent main-branch simulations to calculate sMAPE. For micro-evaluation, we generate an out-of-sample ramification dataset of T = 25 timesteps.
Hardware Specification | Yes | All experiments were run in a cloud-based server with 15 vCores, 180 GB of RAM, and an NVIDIA A100 80GB PCIe GPU.
Software Dependencies | No | The paper mentions optimizers (Adam) and activation functions (Leaky ReLU) with some parameters but does not specify the versions of major software libraries or frameworks (e.g., PyTorch, TensorFlow) used, which is necessary for a reproducible description of ancillary software.
Experiment Setup | Yes | We train both surrogate and ablations for 100 epochs using Adam with learning rate 10^-5 for the diffusion model and Adam with learning rate 2 × 10^-5 for the GNN, batch size equal to number of agents, and diffusion hyper-parameters τ_max = 100 (more information in Supplementary Section A).
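The Dataset Splits entry mentions sMAPE (symmetric mean absolute percentage error) as the macro-evaluation metric. As a rough illustration only, the sketch below computes one common form of sMAPE between two equal-length series. The function name `smape`, the choice of denominator, and the handling of 0/0 terms are assumptions made here; the paper's exact definition is not given on this page and may differ.

```python
def smape(actual, forecast):
    """One common sMAPE variant: mean of |f - a| / ((|a| + |f|) / 2).

    Values lie in [0, 2]. This is an illustrative definition, not
    necessarily the one used in the paper.
    """
    if len(actual) != len(forecast):
        raise ValueError("series must have the same length")
    if len(actual) == 0:
        raise ValueError("series must be non-empty")

    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # Assumed convention: when both values are zero, the term
        # contributes zero error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return total / len(actual)


if __name__ == "__main__":
    # Hypothetical macro-level series, e.g. a population count per timestep.
    ground_truth = [100.0, 120.0, 90.0]
    generated = [110.0, 120.0, 80.0]
    print(smape(ground_truth, generated))
```

With 100 independent simulations, as in the macro-evaluation described above, one plausible use would be to compute this value per run and report the average; whether the paper aggregates that way is not stated here.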