GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels

Authors: Xin Zheng, Miao Zhang, Chunyang Chen, Soheila Molaei, Chuan Zhou, Shirui Pan

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on real-world unseen and unlabeled test graphs demonstrate the effectiveness of our proposed method for GNN model evaluation." The paper empirically evaluates the proposed GNNEvaluator on real-world graph datasets for the node classification task.
Researcher Affiliation | Academia | 1 Monash University, Australia; 2 Harbin Institute of Technology (Shenzhen), China; 3 University of Oxford, UK; 4 Chinese Academy of Sciences, China; 5 Griffith University, Australia
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks are present in the paper.
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the authors' implementation of the proposed method is open-source or publicly available.
Open Datasets | Yes | "Datasets. We perform experiments on three real-world graph datasets, i.e., DBLPv8, ACMv9, and Citationv2, which are citation networks from different original sources (DBLP, ACM, and Microsoft Academic Graph, respectively) by following the processing of [29]."
Dataset Splits | Yes | "For each model, we train it on the training set of the observed graph under the transductive setting, until the model achieves the best node classification on its validation set following the standard training process, i.e., the well-trained GNN model."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory, or cloud instance types) used to run the experiments.
Software Dependencies | No | The paper mentions GNN models such as GCN, GraphSAGE, GAT, GIN, and MLP, but it does not specify any software dependencies (e.g., libraries or frameworks) with version numbers.
Experiment Setup | Yes | "For all GNN and MLP models, the default settings are: (a) the number of layers is 2; (b) the hidden feature dimension is 128; (c) the output feature dimension before the softmax operation is 16." The hyperparameters for training these GNNs and the MLP are given in Table A2. The five seeds for training each model type (GCN, GraphSAGE, GAT, GIN, MLP) are {0, 1, 2, 3, 4}, and the node classification accuracy on the test set of each observed graph, together with the ground-truth accuracy on the unseen, unlabeled test graphs, is reported in Tables A3, A4, A5, A6, and A7, respectively.
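The default settings quoted above (2 layers, 128 hidden dimensions, 16 output dimensions, five model families, five seeds) can be captured as a small configuration sketch. This is an illustrative reconstruction, not the authors' code; the `GNNConfig` dataclass, its field names, and `all_training_runs` are assumptions made for clarity.

```python
from dataclasses import dataclass

@dataclass
class GNNConfig:
    # Hypothetical config object mirroring the paper's stated defaults:
    # 2 layers, hidden dim 128, output dim 16 before the softmax.
    model: str
    num_layers: int = 2
    hidden_dim: int = 128
    output_dim: int = 16
    seed: int = 0

# The five model families and five training seeds reported in the paper.
MODEL_FAMILIES = ["GCN", "GraphSAGE", "GAT", "GIN", "MLP"]
SEEDS = [0, 1, 2, 3, 4]

def all_training_runs():
    """Enumerate every (model family, seed) training combination."""
    return [GNNConfig(model=m, seed=s) for m in MODEL_FAMILIES for s in SEEDS]

runs = all_training_runs()
print(len(runs))  # 5 model families x 5 seeds = 25 trained models
```

Enumerating the runs this way makes explicit that each of Tables A3 to A7 corresponds to one model family evaluated across the five seeds.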