Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FASTRAIN-GNN: Fast and Accurate Self-Training for Graph Neural Networks

Authors: Amrit Nagarajan, Anand Raghunathan

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental On few-shot node classification tasks using different GNN architectures, FASTRAIN-GNN produces models that are consistently more accurate (by up to 4.4%), while also substantially reducing the self-training time (by up to 2.1×) over the current state-of-the-art methods. ... 4 Experiments and Results ... Table 2: Results of training GCN with different label rates.
Researcher Affiliation Academia Amrit Nagarajan (EMAIL), School of Electrical and Computer Engineering, Purdue University; Anand Raghunathan (EMAIL), School of Electrical and Computer Engineering, Purdue University
Pseudocode Yes Algorithm 1: Sampling-based Pseudolabel Filtering (SPF) ... Algorithm 2: Dynamic Regularization (DR) and Dynamic Sizing (DS) ... Algorithm 3: Progressive Graph Pruning (PGP)
Open Source Code Yes Code is available at https://github.com/amrnag/FASTRAIN-GNN.
Open Datasets Yes The datasets used for testing are summarized in Table 1. [Listing Cora, Citeseer, Pubmed, Cora Full]. ... We present results on the Chameleon, Texas, Wisconsin and Cornell datasets with different label rates in Table 12.
Dataset Splits Yes We randomly select (labels/class) nodes of each class as training nodes, and report results on the rest of the nodes in the graph (we do not require a separate held-out validation set for any of the FASTRAIN-GNN optimizations). We repeat this process 100 times for each value of (labels/class), and all results reported in this section are averaged across 100 different training splits (with error bars indicating accuracy range), unless otherwise specified.
Hardware Specification Yes We implement FASTRAIN-GNN using DGL in PyTorch, and evaluate it on a GeForce RTX 2080 Ti GPU with 11GB memory.
Software Dependencies No We implement FASTRAIN-GNN using DGL in PyTorch. The paper mentions DGL and PyTorch but does not provide specific version numbers for these software components.
Experiment Setup Yes The details of the hyperparameters used in all experiments are presented in Appendix D. ... Table 8: Hyperparameters used in our experiments. The exact same hyperparameters are used in all our experiments spanning different datasets, GNN architectures and label rates. ... 4 stages of self-training are performed, with 500 epochs of training in each stage in all methods.
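The split protocol quoted under Dataset Splits (randomly pick a fixed number of labeled nodes per class for training, evaluate on all remaining nodes, and average over 100 random splits with no held-out validation set) can be sketched in plain Python. This is an illustrative reconstruction, not the authors' code; the function and variable names are assumptions.

```python
import random
from collections import defaultdict

def per_class_split(labels, labels_per_class, rng):
    """Randomly pick `labels_per_class` training nodes for each class;
    every remaining node becomes a test node (no held-out validation
    set, matching the quoted protocol)."""
    by_class = defaultdict(list)
    for node, c in enumerate(labels):
        by_class[c].append(node)
    train = set()
    for nodes in by_class.values():
        train.update(rng.sample(nodes, labels_per_class))
    test = [n for n in range(len(labels)) if n not in train]
    return sorted(train), test

# Repeat over 100 random splits and average results, as in the quoted setup.
rng = random.Random(0)
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]  # toy node labels, 3 classes
splits = [per_class_split(labels, labels_per_class=2, rng=rng)
          for _ in range(100)]
```

In the actual experiments, per-split accuracies would be collected inside the loop and averaged, with the accuracy range reported as error bars.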