Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning

Authors: Torben Berndt, Benjamin Walker, Tiexin Qin, Jan Stühmer, Andrey Kormilitzin

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we evaluate the PENG-CDE model on a range of synthetic and real-world tasks. First, we replicate the experiments from [48] and compare them with the non-permutation-equivariant version. Next, we evaluate the model on well-established real-world dynamic graph benchmarks against other common approaches. Finally, we conduct an ablation study to examine the weighting of different basis terms of E_{Σ_n}(2, 1) in Appendix 4.3.
Researcher Affiliation	Academia	1 Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; 2 Mathematical Institute, University of Oxford, UK; 3 City University of Hong Kong, Hong Kong; 4 IAR, Karlsruhe Institute of Technology, Karlsruhe, Germany; 5 Department of Psychiatry, Warneford Hospital, Oxford, UK
Pseudocode	No	The paper describes the proposed model and its components using mathematical equations and textual explanations, such as in Section 3, 'Permutation equivariant neural graph controlled differential equations', but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The source code is available at: https://github.com/hits-mli/perm-equiv-graph-neural-cdes and An anonymised version of the source code used to run the experiments is provided at https://anonymous.4open.science/r/perm_equiv_gn_cdes-BBF8/README.md.
Open Datasets	Yes	First, we consider two node regression tasks from PyTorch Geometric Temporal [55] and then two node affinity prediction tasks from the Temporal Graph Benchmark (TGB) [29, 24]. The TGB and PyTorch Geometric Temporal datasets used in Section 4 are publicly available.
Dataset Splits	Yes	The final 20 snapshots are allocated for extrapolation validation, while from the remaining 100, a random subset of 20 is used for interpolation validation and the remaining 80 for training. For each setting, we generate 50 trajectories each for training, validation, and test sets.
Hardware Specification	Yes	All experiments were conducted on a computing cluster equipped with NVIDIA H100 (80 GB) and H200 (141 GB) GPUs. Each node in the cluster features 64-core AMD EPYC 9334 CPUs running at 3.90 GHz, along with 256 GB of RAM.
Software Dependencies	No	The implementation is written in the Python programming language [62], and uses the JAX framework [4]. Key dependencies include Diffrax [34] for differentiable ODE solvers, Equinox [35] for neural network modules in JAX, Optax [4] for optimisation, and Lineax [50] for linear algebra routines. Additional dependencies include NumPy [28], Exca [51], PyTorch [55], and PyTorch Geometric [23]. The paper lists several software packages but does not provide specific version numbers for them.
Experiment Setup	Yes	The models are trained for 2000 epochs using the Adam optimiser [37] with a learning rate of 10^-2 and weight decay of 10^-4. All models are trained over 200 epochs with an early stopping criterion with a patience of 15 and a minimum epoch of 20. All hyperparameters have been selected from the ranges in Table 7 using a grid search. For the chosen values see Tables 8, 9 and 10.
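
The snapshot split quoted under Dataset Splits and the early stopping rule quoted under Experiment Setup can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `split_snapshots` and `should_stop`, the random seed, and the exact stopping semantics (stop once the best validation loss is at least `patience` epochs old and at least `min_epochs` epochs have run) are assumptions made for illustration. The total of 120 snapshots is inferred from the quoted "final 20" plus "remaining 100", and the two quotes may refer to different experiments in the paper.

```python
import random


def split_snapshots(num_snapshots=120, num_extrapolation=20,
                    num_interpolation=20, seed=0):
    """Return (train, interp_val, extrap_val) lists of snapshot indices.

    Hypothetical helper: names, defaults, and seed are illustrative only.
    """
    indices = list(range(num_snapshots))
    cutoff = num_snapshots - num_extrapolation
    # The final snapshots are held out for extrapolation validation.
    extrap_val = indices[cutoff:]
    remaining = indices[:cutoff]
    # A random subset of the remaining snapshots is used for
    # interpolation validation; the rest is used for training.
    rng = random.Random(seed)
    interp_val = sorted(rng.sample(remaining, num_interpolation))
    interp_set = set(interp_val)
    train = [i for i in remaining if i not in interp_set]
    return train, interp_val, extrap_val


def should_stop(val_losses, patience=15, min_epochs=20):
    """Assumed early stopping rule over a list of per-epoch validation losses.

    Returns True once at least `min_epochs` epochs have run and the best
    loss was recorded at least `patience` epochs before the latest one.
    """
    epochs_run = len(val_losses)
    if epochs_run < max(min_epochs, 1):
        return False
    # Index of the first epoch that achieved the lowest validation loss.
    best_epoch = min(range(epochs_run), key=val_losses.__getitem__)
    return (epochs_run - 1 - best_epoch) >= patience


if __name__ == "__main__":
    train, interp_val, extrap_val = split_snapshots()
    print(len(train), len(interp_val), len(extrap_val))
```

With the defaults, the three index lists are disjoint and together cover all 120 snapshots. A training loop would call `should_stop` after each epoch with the validation losses recorded so far.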