MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
Authors: Xiaotian Han, Tong Zhao, Yozen Liu, Xia Hu, Neil Shah
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on multiple large-scale graph datasets with diverse GNN architectures validate that MLPInit can accelerate the training of GNNs (up to 33× speedup on OGB-products) and often improve prediction performance (e.g., up to 7.97% improvement for GraphSAGE across 7 datasets for node classification, and up to 17.81% improvement across 4 datasets for link prediction on metric Hits@10). |
| Researcher Affiliation | Collaboration | Xiaotian Han1 Tong Zhao2 Yozen Liu2 Xia Hu3 Neil Shah2 1Texas A&M University 2Snap Inc. 3Rice University |
| Pseudocode | Yes | We present PyTorch-style pseudo-code of MLPInit in the node classification setting in Algorithm 1. (A minimal re-creation sketch is given below the table.) |
| Open Source Code | Yes | The code is available at https://github.com/snap-research/MLPInit-for-GNNs. |
| Open Datasets | Yes | For node classification, we consider Flickr, Yelp, Reddit, Reddit2, A-products, and two OGB datasets (Hu et al., 2020), OGB-arXiv and OGB-products, as benchmark datasets. |
| Dataset Splits | Yes | We construct the PeerMLP for each GNN. We first train the PeerMLP for 50 epochs and save the model with the best validation performance. |
| Hardware Specification | Yes | We run our experiments on the machine with one NVIDIA Tesla T4 GPU (16GB memory) and 60GB DDR4 memory to train the models. For A-products and OGB-products datasets, we run the experiments with one NVIDIA A100 GPU (40GB memory). |
| Software Dependencies | Yes | The code is implemented based on PyTorch 1.9.0 (Paszke et al., 2019) and PyTorch Geometric 2.0.4 (Fey & Lenssen, 2019). |
| Experiment Setup | Yes | Table 21: Training configuration for GNN training in Figures 3, 8 and 9 and Tables 3 and 4. (This table specifies the #Layers, #Hidden, Learning rate, Batch size, Dropout, Weight decay, and Epoch settings for each model and dataset, detailing the experimental setup.) |
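
The paper's Algorithm 1 is not reproduced on this page, but the method it describes is simple: train a PeerMLP, an MLP whose weight matrices have the same shapes as the GNN's, on node features alone, then copy its trained weights into the GNN as initialization before regular GNN training. Below is a minimal PyTorch / PyTorch Geometric sketch of that idea; the `PeerMLP`, `SAGE`, and `mlp_init` names and the explicit per-layer weight copy are illustrative assumptions, not the authors' released code.

```python
# Minimal MLPInit sketch (assumption: a 2-layer GraphSAGE with root_weight=False,
# so each layer has a single weight matrix matching the peer MLP's layers).
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv


class PeerMLP(torch.nn.Module):
    """MLP whose layer weights have the same shapes as the GNN's weights."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = torch.nn.Linear(in_dim, hid_dim)
        self.lin2 = torch.nn.Linear(hid_dim, out_dim)

    def forward(self, x):
        return self.lin2(F.relu(self.lin1(x)))


class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        # root_weight=False keeps one weight matrix per layer, so shapes
        # line up one-to-one with the PeerMLP above.
        self.conv1 = SAGEConv(in_dim, hid_dim, root_weight=False)
        self.conv2 = SAGEConv(hid_dim, out_dim, root_weight=False)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)


def mlp_init(gnn: SAGE, mlp: PeerMLP) -> None:
    """Copy the trained PeerMLP weights into the GNN (the MLPInit step)."""
    with torch.no_grad():
        gnn.conv1.lin_l.weight.copy_(mlp.lin1.weight)
        gnn.conv1.lin_l.bias.copy_(mlp.lin1.bias)
        gnn.conv2.lin_l.weight.copy_(mlp.lin2.weight)
        gnn.conv2.lin_l.bias.copy_(mlp.lin2.bias)


# Usage: train the cheap PeerMLP on node features only (no message passing),
# transfer its weights with mlp_init, then fine-tune the GNN as usual:
#   mlp = PeerMLP(d_in, d_hid, n_cls); ...train mlp on (x, y)...
#   gnn = SAGE(d_in, d_hid, n_cls); mlp_init(gnn, mlp); ...train gnn...
```

Because the PeerMLP ignores edges, each of its epochs is far cheaper than a GNN epoch, which is where the reported training-time savings come from; the subsequent GNN training simply starts from the transferred weights instead of a random initialization.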