Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
Authors: Xiaotian Han, Tong Zhao, Yozen Liu, Xia Hu, Neil Shah
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on multiple large-scale graph datasets with diverse GNN architectures validate that MLPInit can accelerate the training of GNNs (up to 33× speedup on OGB-products) and often improve prediction performance (e.g., up to 7.97% improvement for GraphSAGE across 7 datasets for node classification, and up to 17.81% improvement across 4 datasets for link prediction on metric Hits@10). |
| Researcher Affiliation | Collaboration | Xiaotian Han¹, Tong Zhao², Yozen Liu², Xia Hu³, Neil Shah²; ¹Texas A&M University, ²Snap Inc., ³Rice University |
| Pseudocode | Yes | We present PyTorch-style pseudo-code of MLPInit in node classification setting in Algorithm 1. |
| Open Source Code | Yes | The code is available at https://github.com/snapresearch/MLPInit-for-GNNs. |
| Open Datasets | Yes | For node classification, we consider Flickr, Yelp, Reddit, Reddit2, A-products, and two OGB datasets (Hu et al., 2020), OGB-arXiv and OGB-products as benchmark datasets. |
| Dataset Splits | Yes | We construct the PeerMLP for each GNN. We first train the PeerMLP for 50 epochs and save the best model with the best validation performance. |
| Hardware Specification | Yes | We run our experiments on the machine with one NVIDIA Tesla T4 GPU (16GB memory) and 60GB DDR4 memory to train the models. For A-products and OGB-products datasets, we run the experiments with one NVIDIA A100 GPU (40GB memory). |
| Software Dependencies | Yes | The code is implemented based on PyTorch 1.9.0 (Paszke et al., 2019) and PyTorch Geometric 2.0.4 (Fey & Lenssen, 2019). |
| Experiment Setup | Yes | Table 21: Training configuration for GNNs training in Figures 3, 8 and 9 and Tables 3 and 4. (This table provides details on #Layers, #Hidden, Learning rate, Batch size, Dropout, Weight decay, and Epoch for various models and datasets, serving as a specific experimental setup.) |
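
The procedure quoted in the Pseudocode and Dataset Splits rows (train a PeerMLP for 50 epochs, keep the checkpoint with the best validation score, then use it to initialize the GNN) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' Algorithm 1: the function names (`train_peer_mlp`, `mlp_init`, `gnn_forward`, and so on), the list-of-lists weight representation, the omission of nonlinearities, and the caller-supplied `train_step` and `validate` callbacks are all assumptions made for this example.

```python
import copy


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mlp_forward(weights, features):
    """Hypothetical PeerMLP forward pass: each layer is X @ W, with no graph."""
    h = features
    for w in weights:
        h = matmul(h, w)
    return h


def gnn_forward(weights, features, adj_norm):
    """Hypothetical GNN forward pass: each layer is A_norm @ (X @ W).

    The weight matrices have the same shapes as in mlp_forward, which is
    what makes a direct weight transfer possible in this sketch.
    """
    h = features
    for w in weights:
        h = matmul(adj_norm, matmul(h, w))
    return h


def train_peer_mlp(init_weights, train_step, validate, epochs=50):
    """Train for `epochs` epochs and return the best-validation checkpoint.

    `train_step(weights) -> weights` and `validate(weights) -> float` are
    placeholders supplied by the caller; a higher score is assumed better.
    """
    weights = copy.deepcopy(init_weights)
    best_weights, best_score = copy.deepcopy(weights), float("-inf")
    for _ in range(epochs):
        weights = train_step(weights)
        score = validate(weights)
        if score > best_score:
            best_weights, best_score = copy.deepcopy(weights), score
    return best_weights, best_score


def mlp_init(gnn_weights, peer_mlp_weights):
    """Return GNN weights initialized from the PeerMLP checkpoint.

    Raises ValueError if any layer shape differs between the two models.
    """
    if len(gnn_weights) != len(peer_mlp_weights):
        raise ValueError("layer count mismatch")
    for g, m in zip(gnn_weights, peer_mlp_weights):
        if len(g) != len(m) or any(len(gr) != len(mr) for gr, mr in zip(g, m)):
            raise ValueError("weight shape mismatch")
    return copy.deepcopy(peer_mlp_weights)
```

In a PyTorch setting the transfer step would more likely go through the models' state dictionaries; the sketch only shows why matching weight shapes make that transfer possible.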