Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

How Particle System Theory Enhances Hypergraph Message Passing

Authors: Yixuan Ma, Kai Yi, Pietro Liò, Shi Jin, Yu Guang Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We prove theoretically that our approach mitigates oversmoothing by maintaining a positive lower bound on the hypergraph Dirichlet energy during propagation and thus to enable hypergraph message passing to go deep. Empirically, our models demonstrate competitive performance on diverse real-world hypergraph node classification tasks, excelling on both homophilic and heterophilic datasets. 6 Experiments 6.2 Node Classifications on Hypergraphs 6.3 Ablation Studies
Researcher Affiliation	Academia	Yixuan Ma Shanghai Jiao Tong University EMAIL Kai Yi University of Cambridge EMAIL Pietro Liò University of Cambridge EMAIL Shi Jin Shanghai Jiao Tong University EMAIL Yu Guang Wang Shanghai Jiao Tong University EMAIL
Pseudocode	Yes	A Algorithms Complexity Analysis. Here, we analyze the computational complexity of one layer in HAMP-I and HAMP-II. Analytically, the time complexity is O(|V||E|c^2 + |V|c), where |V|, |E| and c are the number of nodes, number of hyperedges and number of hidden dimension, respectively. However, the incidence matrix H is a sparse matrix, so the time complexity is O((tr(D_v) + tr(D_e))c^2 + |V|c), where tr(D_v) is the sum of the degrees of all nodes and tr(D_e) is the sum of the number of nodes contained in all hyperedges. The detailed process of HAMP-I and HAMP-II are shown in Algorithm 1 and Algorithm 2. Algorithm 1 The HAMP-I Algorithm for Hypergraph Node Classification. Algorithm 2 The HAMP-II Algorithm for Hypergraph Node Classification.
Open Source Code	Yes	Empirically, our models demonstrate competitive performance on diverse real-world hypergraph node classification tasks, excelling on both homophilic and heterophilic datasets. Source code is available at the link.
Open Datasets	Yes	6.1 Experiment Setup Datasets. Following ED-HNN [43], the real-world hypergraph benchmarking datasets span diverse domains, scales, and heterophilic levels. They can be divided into two groups based on homophily. The homophilic hypergraphs include academic citation networks (Cora, Citeseer, and Pubmed) and co-authorship networks (Cora-CA and DBLP-CA). The heterophilic hypergraphs cover legislative voting records (Congress, House, and Senate) and retail relationships (Walmart).
Dataset Splits	No	The paper mentions using "real-world hypergraph benchmarking datasets" and following the "same training recipe as ED-HNN". While these benchmarks typically have standard splits for node classification, the paper itself does not explicitly provide the percentages, sample counts, or methodology for splitting the data into training, validation, and test sets within its text or appendices. It refers to external work for the training recipe but does not detail the splits.
Hardware Specification	Yes	All experiments are implemented on an NVIDIA RTX 4090 GPU with Pytorch.
Software Dependencies	No	The paper mentions 'Pytorch' as a software dependency. However, it does not specify a version number for Pytorch or any other software component used in the experiments, which is required for reproducible software dependency details.
Experiment Setup	Yes	To ensure fairness, we follow the same training recipe as ED-HNN. Specifically, we train the model for 500 epochs using the Adam optimizer with a learning rate of 0.001 and no weight decay during the training phases, and we apply early stopping with a patience of 50. For stability, we run 10 trials with different seeds and report the mean and the standard deviation of the results. We explore the parameter space by grid search, where the search ranges for each critical hyperparameter are delineated below: Dropout rate in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}; Layer number of classifier in {1, 2, 3}; Hidden dimension of classifier in {128, 256, 512}; Hidden dimension of model in {128, 256, 512, 1024}; Step size of solver in {0.09, 0.1, 0.15, 0.2, 0.25}; γ of repulsive force in {0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15}; Initial values of learnable parameters δ of damping term in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}; Initial values of learnable parameters ϵ of noise term in {0, 0.1, 0.3}. Tab. 7 and Tab. 8 summarize the best hyperparameters on standard hypergraph benchmarks using HAMP-I and HAMP-II, respectively.
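
To make the Experiment Setup row easier to check, the quoted search ranges and training recipe can be written down as plain data. The sketch below is illustrative only: the dictionary keys and function names are assumptions chosen for readability, not identifiers from the authors' code, and the row does not say whether the full Cartesian product was searched or which standard-deviation estimator was reported (the sample estimator is assumed here).

```python
from itertools import product
from math import prod
from statistics import mean, stdev

# Search ranges copied from the Experiment Setup row.
# Key names are hypothetical, chosen only for this sketch.
SEARCH_SPACE = {
    "dropout": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "classifier_layers": [1, 2, 3],
    "classifier_hidden_dim": [128, 256, 512],
    "model_hidden_dim": [128, 256, 512, 1024],
    "solver_step_size": [0.09, 0.1, 0.15, 0.2, 0.25],
    "gamma_repulsion": [round(0.01 * k, 2) for k in range(1, 16)],  # 0.01 .. 0.15
    "delta_damping_init": list(range(16)),                          # 0 .. 15
    "epsilon_noise_init": [0, 0.1, 0.3],
}

# Fixed training recipe quoted in the same row (key names are hypothetical).
TRAINING_RECIPE = {
    "epochs": 500,
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "weight_decay": 0.0,
    "early_stopping_patience": 50,
    "num_trials": 10,
}


def grid_size(space):
    """Number of configurations in the full Cartesian product of the ranges."""
    return prod(len(values) for values in space.values())


def iter_configs(space):
    """Lazily yield one dict per point of the full grid."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def summarize_trials(scores):
    """Mean and sample standard deviation over per-seed scores."""
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    print("full grid size:", grid_size(SEARCH_SPACE))
    print("first config:", next(iter_configs(SEARCH_SPACE)))
```

If the whole product were enumerated, the grid would contain 9 × 3 × 3 × 4 × 5 × 15 × 16 × 3 = 1,166,400 configurations, each run for 10 seeds, so the generator is kept lazy rather than materializing the list.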