Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning

Authors: Haonan Yuan, Qingyun Sun, Junhua Shi, Xingcheng Fu, Bryan Hooi, Jianxin Li, Philip S Yu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate the superiority of GRAVER over effectiveness, robustness, and efficiency on downstream few-shot node and graph classification tasks compared with 15 state-of-the-art baselines.
Researcher Affiliation	Academia	(1) SKLCCSE, School of Computer Science and Engineering, Beihang University; (2) Key Lab of Education Blockchain and Intelligent Technology, Guangxi Normal University; (3) School of Computing, National University of Singapore; (4) Department of Computer Science, University of Illinois, Chicago
Pseudocode	Yes	Algorithm 1: Pre-training pipeline of GRAVER. Algorithm 2: Fine-tuning pipeline of GRAVER.
Open Source Code	Yes	Codes are available at: https://github.com/RingBDStack/GRAVER.
Open Datasets	Yes	Cora [65], CiteSeer [16], PubMed [73], and the large-scale ogbn-arXiv [28]. Co-purchase Domain: ogbn-Tech and ogbn-Home from large-scale co-purchase network [28]. Web Link Domain: Wiki-CS [66]
Dataset Splits	Yes	We evaluate node and graph classification under the m-shot setting, where m labeled samples per class are randomly selected. As each dataset contains a single large graph, we follow prior work [61, 118, 123] to extract ego-graphs centered on target nodes for graph classification, assigning labels based on the central nodes. Accuracy (Acc.) is used for evaluation.
Hardware Specification	Yes	CPU: Intel(R) Xeon(R) Platinum 8358 CPU@2.60GHz with 1TB DDR4 of Memory. GPU: NVIDIA Tesla A100 SMX4 with 80GB of Memory.
Software Dependencies	Yes	Software: CUDA 10.1, Python 3.8.12, PyTorch 1.9.1, PyTorch Geometric 2.0.1.
Experiment Setup	Yes	We pre-train GRAVER for up to 10,000 epochs, with early stopping applied if training loss does not decrease for 50 consecutive epochs to ensure efficiency without compromising performance. The aligned feature dimension is set to 64, and the hidden dimension of the encoder is 256. The number of disentangled channels (factors) K is set to 4, the number of routing iterations T is set to 3, and the number of convolution layers is set to 2. The trade-off hyperparameter λ is tuned within the range of 0 to 1. For optimization, we adopt the Adam optimizer [39], with the learning rate and weight decay selected from the range of 1e-5 to 1e-2 via grid search on the validation set.
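
The Dataset Splits and Experiment Setup rows describe two procedures that can be illustrated with a minimal sketch: random m-shot selection per class, and early stopping after 50 epochs without a decrease in training loss. This is not the authors' code. The names `CONFIG`, `sample_m_shot`, and `epochs_until_early_stop` are hypothetical, and the sketch assumes that "does not decrease" means "does not fall below the best loss seen so far", which the quoted text does not specify.

```python
import random
from collections import defaultdict
from itertools import islice

# Values quoted in the Experiment Setup row; the dictionary layout is hypothetical.
CONFIG = {
    "max_epochs": 10_000,       # upper bound on pre-training epochs
    "patience": 50,             # epochs without a loss decrease before stopping
    "aligned_dim": 64,          # aligned feature dimension
    "hidden_dim": 256,          # encoder hidden dimension
    "num_channels_K": 4,        # disentangled channels (factors)
    "routing_iterations_T": 3,  # routing iterations
    "num_conv_layers": 2,       # convolution layers
}


def sample_m_shot(labels, m, seed=0):
    """Randomly pick m sample indices per class (hypothetical helper).

    Raises ValueError if any class has fewer than m samples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    return {label: sorted(rng.sample(idxs, m)) for label, idxs in by_class.items()}


def epochs_until_early_stop(losses, max_epochs=CONFIG["max_epochs"],
                            patience=CONFIG["patience"]):
    """Return how many epochs run before early stopping or max_epochs.

    `losses` is an iterable of per-epoch training losses. Assumption: an
    epoch counts as "no decrease" when its loss is not below the best so far.
    """
    best = float("inf")
    stale = 0
    epochs_run = 0
    for epoch, loss in enumerate(islice(losses, max_epochs), start=1):
        epochs_run = epoch
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epochs_run
```

In an actual training loop the loss sequence would come from the optimizer steps rather than a precomputed list; the counter logic stays the same.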