Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

GMV: A Unified and Efficient Graph Multi-View Learning Framework

Authors: Qipeng Zhu, Jie Chen, Jian Pu, Junping Zhang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments demonstrate that GMV surpasses other augmentation and ensemble techniques for GNNs and Graph Transformers across various graph classification scenarios. The open source code can be found in https://github.com/smurf-1119/GMV. 4 Experiments. Table columns (Method): IMDBB, PROTEINS, NCI1, NCI109, REDDITB, IMDBM, REDDIT-M5, COLLAB; #graphs: 1000, 1113, 4110, 4127, 2000, 1500, 4999, 5000; #classes: 2, 2, 2, 2, 2, 2, 3, 5; #avg nodes: 19.8, 39.1, 29.9, 29.7, 429.6, 13.0, 508.5, 74.5; #avg edges: 96.5, 72.8, 32.3, 32.1, 497.8, 65.9, 594.9, 2457.2
Researcher Affiliation	Academia	1Shanghai Key Laboratory of Intelligent Information Processing, College of Computer Science and Artificial Intelligence, Fudan University 2 College of Computer and Data Science, Fuzhou University 3Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University EMAIL, EMAIL, EMAIL
Pseudocode	Yes	The specific process is outlined in Algorithm 1. We first pick a random root node from the graph G. We consider both structural and semantic information of G by merging different node candidate sets [44]. Depth-First-Search (DFS) algorithm and Breadth-First-Search (BFS) algorithm [45] can easily extract the original topology structure of G. And the PPR algorithm considers semantic information by iteratively calculating the importance score of every node in G [33]. Therefore, we respectively use DFS, BFS and PPR methods to gain sampling node set {VBFS, VDFS, VPPR} from G. We set w as the maximum searching steps for DFS and BFS algorithms. To preserve those important nodes, we calculate the affinity personalized pagerank score matrix SPPR [41] as follows: Algorithm 1: Structure Enhanced PPR Subgraph Sampling. Input: Graph G = <V, E, A, X>, augmentation ratio p ∈ (0, 1), structure augmentation ratio q, number of walks w. Output: Ordered node set V
Open Source Code	Yes	Our experiments demonstrate that GMV surpasses other augmentation and ensemble techniques for GNNs and Graph Transformers across various graph classification scenarios. The open source code can be found in https://github.com/smurf-1119/GMV.
Open Datasets	Yes	Table 1 and Table 2 outline the specifics of eight real-world datasets from the TUDatasets benchmark [51] and three datasets from open graph benchmark (OGB) [52].
Dataset Splits	Yes	For each method, we conduct 10-fold cross-validation experiments on each dataset from TUDataset Benchmark, calculating the mean accuracy and standard deviation to derive results. Following S-Mixup [7], the datasets are split into training, validation and test sets. Specifically, 80% for training, 10% for validation, and 10% for testing. For the datasets from OGB Graph Benchmark [50], we adopt the public train/validation/test splits, and report the results of the test set.
Hardware Specification	Yes	All experiments are conducted on NVIDIA 3090TI GPUs.
Software Dependencies	No	The paper does not explicitly mention specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup	Yes	For each method, we conduct 10-fold cross-validation experiments on each dataset from TUDataset Benchmark, calculating the mean accuracy and standard deviation to derive results. Following S-Mixup [7], the datasets are split into training, validation and test sets. Specifically, 80% for training, 10% for validation, and 10% for testing. For the datasets from OGB Graph Benchmark [50], we adopt the public train/validation/test splits, and report the results of the test set. We conduct each experiment three times and utilize area under curve (AUC) as measurement on these OGB graph datasets. All experiments are conducted on NVIDIA 3090TI GPUs.
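
Illustrative sketch (not taken from the paper or its repository): the quoted protocol combines 10-fold cross-validation with an 80%/10%/10% train/validation/test split. One plausible way to realize both at once is to cut the shuffled graph indices into ten folds and, for each round, use one fold for testing, the next fold for validation, and the remaining eight for training. The function name `ten_fold_splits`, the seed, and the rule for choosing the validation fold are assumptions made here for illustration; the authors' code may differ.

```python
import random


def ten_fold_splits(num_graphs, num_folds=10, seed=0):
    """Yield one (train, val, test) triple of index lists per fold.

    Hypothetical helper: fold k is the test set, fold k+1 (wrapping
    around) is the validation set, and the other folds form the
    training set, giving roughly 80/10/10 proportions for 10 folds.
    """
    indices = list(range(num_graphs))
    random.Random(seed).shuffle(indices)
    # Strided slicing produces num_folds disjoint, near-equal folds.
    folds = [indices[i::num_folds] for i in range(num_folds)]
    for k in range(num_folds):
        val_k = (k + 1) % num_folds
        test = folds[k]
        val = folds[val_k]
        train = [
            idx
            for j, fold in enumerate(folds)
            if j not in (k, val_k)
            for idx in fold
        ]
        yield train, val, test


# 1000 graphs matches the size listed for IMDBB in the table above.
splits = list(ten_fold_splits(1000))
```

Under this scheme every graph appears in exactly one test fold across the ten rounds, and the reported mean accuracy and standard deviation would be computed over the ten per-fold test accuracies.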