Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs

Authors: Yaming Yang, Ziyu Zheng, Weigang Lu, Zhe Wang, Xinyan Huang, Wei Zhao, Ziyu Guan

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments show that HHNN can achieve significantly better performance than state-of-the-art baselines, and the discovered hyper-meta-paths bring good interpretability for the model predictions. To facilitate the reproducibility of this work, we provide our dataset as well as source code at: https://github.com/zhengziyu77/HHNN. In this section, we conduct comprehensive experiments on two real-world datasets. Please see Appendix A for dataset details, see Appendix B for the details of the used baseline methods, and see Appendix C for the detailed experimental settings.
Researcher Affiliation	Academia	¹School of Computer Science and Technology, Xidian University, China; ²Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; ³Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, China. {yym@, zhengziyu@stu., wglu@stu., zwang@stu., ywzhao@mail., zyguan@}xidian.edu.cn
Pseudocode	Yes	The overall training process of our proposed HHNN is shown in Algorithm 1. Algorithm 1: The training process of HHNN. Input: The heterogeneous hypergraph G = (V, E, H, W, ϕ, ψ); the number of model layers N. Output: The embeddings of all the nodes and hyperedges. Step 1: Randomly initialize all the trainable model parameters. Step 2: for n = 1, ..., N do. Step 3: # from nodes to hyperedges. Step 4: Perform α-Attention aggregation according to Eqs. (1-3). Step 5: # from hyperedges to nodes. Step 6: Perform β-Attention (intra-type) aggregation according to Eqs. (4-6). Step 7: Perform γ-Attention (inter-type) aggregation according to Eqs. (7-8). Step 9: Compute loss according to Eq. (9). Step 10: Update model parameters by gradient descent.
Open Source Code	Yes	To facilitate the reproducibility of this work, we provide our dataset as well as source code at: https://github.com/zhengziyu77/HHNN.
Open Datasets	Yes	To facilitate the reproducibility of this work, we provide our dataset as well as source code at: https://github.com/zhengziyu77/HHNN. Movielens is a real-world movie dataset, which was originally released by Group Lens², a research laboratory at University of Minnesota. ... ²https://files.grouplens.org/datasets/hetrec2011/hetrec2011-movielens-2k-v2.zip. Olist is a real-world e-commercial dataset. It contains the orders of Olist Store, a Brazilian ecommerce platform. The raw dataset was originally released at Kaggle³, a data science competition platform. ... ³https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce.
Dataset Splits	Yes	Specifically, on each dataset, we randomly select τ% ground-truth labels as the training set, and the rest (1 − τ)% are divided equally as the validation set and the test set, where τ ∈ {20, 40}.
Hardware Specification	Yes	All the experiments are conducted on Intel(R) Core(TM) i9-10980XE CPU and NVIDIA TITAN RTX GPU with 24GB GPU memory.
Software Dependencies	No	We use the Pytorch framework to implement our proposed HHNN. The number of model layers is searched in {1, 2, 3, 4, 5}, the dimensionality of hidden node/hyperedge representations is searched in {8, 16, 32, 64, 128}, the number of attention heads is searched in {1, 2, 4, 8}. We use the Adam optimizer to optimize all the trainable model parameters, which are randomly initialized by the Xavier uniform distribution [16]. For ease of tuning, the optimizer settings are the same for both datasets. Specifically, the learning rate is set to 0.001, the weight decay is set to 0.0, and the attention dropout rate is set to 0.5.
Experiment Setup	Yes	We use the Pytorch framework to implement our proposed HHNN. The number of model layers is searched in {1, 2, 3, 4, 5}, the dimensionality of hidden node/hyperedge representations is searched in {8, 16, 32, 64, 128}, the number of attention heads is searched in {1, 2, 4, 8}. We use the Adam optimizer to optimize all the trainable model parameters, which are randomly initialized by the Xavier uniform distribution [16]. For ease of tuning, the optimizer settings are the same for both datasets. Specifically, the learning rate is set to 0.001, the weight decay is set to 0.0, and the attention dropout rate is set to 0.5.
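
The split protocol and search space quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the random seed, the rounding of split sizes, and the reading of "searched in" as an exhaustive grid are all assumptions that the quoted text does not confirm.

```python
import itertools
import random


def split_labels(labeled_ids, tau, seed=0):
    """Randomly split labeled items: tau% train, the remainder halved into val/test.

    The seed and the rounding rule are assumptions; the quoted text gives neither.
    """
    if not 0 < tau < 100:
        raise ValueError("tau must be a percentage strictly between 0 and 100")
    ids = list(labeled_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * tau / 100)
    n_val = (len(ids) - n_train) // 2
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # receives the extra item if the remainder is odd
    return train, val, test


# Search space as quoted in the Experiment Setup row.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3, 4, 5],
    "hidden_dim": [8, 16, 32, 64, 128],
    "num_heads": [1, 2, 4, 8],
}

# Optimizer settings quoted as identical for both datasets.
FIXED_SETTINGS = {
    "optimizer": "Adam",
    "lr": 0.001,
    "weight_decay": 0.0,
    "attention_dropout": 0.5,
}


def iter_configs(space=SEARCH_SPACE, fixed=FIXED_SETTINGS):
    """Yield every combination of the searched values (assumes a full grid)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield {**dict(zip(keys, values)), **fixed}


if __name__ == "__main__":
    for tau in (20, 40):
        train, val, test = split_labels(range(1000), tau)
        print(tau, len(train), len(val), len(test))
    print(sum(1 for _ in iter_configs()), "candidate configurations")
```

Under these assumptions, a reproduction would train one model per configuration and per τ, then select the configuration by validation performance; the selection criterion itself is not stated in the quoted text.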