Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation
Authors: Shichang Zhang, Yozen Liu, Yizhou Sun, Neil Shah
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Under a production setting involving both transductive and inductive predictions across 7 datasets, GLNN accuracies improve over stand-alone MLPs by 12.36% on average and match GNNs on 6/7 datasets. Comprehensive analysis shows when and why GLNNs can achieve competitive accuracies to GNNs and suggests GLNN as a handy choice for latency-constrained applications. Also, from the Evaluation Protocol: For all experiments in this section, we report the average and standard deviation over ten runs with different random seeds. Model performance is measured as accuracy, and results are reported on test data with the best model selected using validation data. |
| Researcher Affiliation | Collaboration | Shichang Zhang, University of California, Los Angeles (shichang@cs.ucla.edu); Yozen Liu, Snap Inc. (yliu2@snap.com); Yizhou Sun, University of California, Los Angeles (yzsun@cs.ucla.edu); Neil Shah, Snap Inc. (nshah@snap.com) |
| Pseudocode | No | The paper describes the GLNN framework conceptually and mathematically (Equation 1), but does not provide pseudocode or an algorithm block; a hedged sketch of the objective is given below the table. |
| Open Source Code | Yes | Code available at https://github.com/snap-research/graphless-neural-networks |
| Open Datasets | Yes | Datasets. We consider all five datasets used in the CPF paper (Yang et al., 2021a), i.e. Cora, Citeseer, Pubmed, A-computer, and A-photo. To fully evaluate our method, we also include two larger OGB datasets (Hu et al., 2020), i.e. Arxiv and Products. |
| Dataset Splits | Yes | Evaluation Protocol. For all experiments in this section, we report the average and standard deviation over ten runs with different random seeds. Model performance is measured as accuracy, and results are reported on test data with the best model selected using validation data. We also evaluate on V^U_obs containing the other 80% of the test data (a sketch of this split is shown below the table)... For all datasets, we follow the setting in the original paper to split the data... For the OGB datasets, we follow the OGB official splits based on time and popularity for Arxiv and Products respectively. |
| Hardware Specification | Yes | We run all experiments on a machine with 80 Intel(R) Xeon(R) E5-2698 v4 @ 2.20GHz CPUs, and a single NVIDIA V100 GPU with 16GB RAM. |
| Software Dependencies | No | The experiments on both baselines and our approach are implemented using PyTorch, the DGL (Wang et al., 2019) library for GNN algorithms, and Adam (Kingma & Ba, 2015) for optimization. While the paper lists its software stack, it does not provide specific version numbers for PyTorch or DGL. |
| Experiment Setup | Yes | The hyperparameters of GNN models on each dataset are taken from the best hyperparameters provided by the CPF paper and the OGB official examples. For the student MLPs and GLNNs, unless otherwise specified with -wi or -Li, we set the number of layers and the hidden dimension of each layer to be the same as the teacher GNN, so their total number of parameters stays the same as the teacher GNN. For GLNNs we do a hyperparameter search of learning rate from [0.01, 0.005, 0.001], weight decay from [0, 0.001, 0.002, 0.005, 0.01], and dropout from [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]. A sketch of this search loop is given below the table. |
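
Since the paper gives the GLNN objective only as Equation 1 (see the Pseudocode row), the following is a minimal PyTorch sketch of what such a distillation loss typically looks like: a weighted sum of cross-entropy on labeled nodes and KL divergence between the MLP student and the GNN teacher's soft labels. The function name `glnn_loss`, the mixing weight `lam`, and the temperature `tau` are illustrative assumptions, not identifiers from the released code.

```python
import torch.nn.functional as F

def glnn_loss(student_logits, labels, teacher_logits, labeled_mask, lam=0.5, tau=1.0):
    """Weighted sum of label loss (labeled nodes only) and soft-label distillation (all nodes).

    Sketch of Equation 1 under the usual knowledge-distillation formulation;
    names and defaults here are illustrative, not from the GLNN release.
    """
    # Standard cross-entropy, computed only on the labeled subset of nodes.
    ce = F.cross_entropy(student_logits[labeled_mask], labels[labeled_mask])
    # KL divergence between temperature-scaled student and teacher distributions.
    kl = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    )
    return lam * ce + (1.0 - lam) * kl
```

The teacher soft labels come from a GNN trained once over the graph; at inference time only the MLP student is evaluated, which is where the latency advantage reported in the paper comes from.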
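
The Dataset Splits row refers to evaluating on V^U_obs, the 80% of test nodes that stay observed during training, with the remaining 20% held out as inductive nodes. Below is a minimal sketch of such a partition, assuming a plain random split over the test indices; the helper `split_test_nodes` and its seeding are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def split_test_nodes(test_idx, obs_ratio=0.8, seed=0):
    """Partition test nodes into observed nodes (V^U_obs) and held-out inductive nodes.

    The 80/20 ratio follows the quoted evaluation text; the random split itself
    is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(test_idx))
    n_obs = int(len(shuffled) * obs_ratio)
    return shuffled[:n_obs], shuffled[n_obs:]  # (observed test nodes, inductive test nodes)
```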
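
The Experiment Setup row spells out the search grids for learning rate, weight decay, and dropout, with the best model chosen on validation data per the evaluation protocol. Here is a minimal sketch of that search, assuming a hypothetical `train_and_validate` callback that trains one GLNN configuration and returns its validation accuracy; the grids are taken from the quoted text, everything else is illustrative.

```python
from itertools import product

# Search grids quoted from the paper's experiment setup.
LEARNING_RATES = [0.01, 0.005, 0.001]
WEIGHT_DECAYS = [0, 0.001, 0.002, 0.005, 0.01]
DROPOUTS = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

def grid_search(train_and_validate):
    """Exhaustive search over the quoted grids; selects by validation accuracy."""
    best_acc, best_cfg = -1.0, None
    for lr, wd, dropout in product(LEARNING_RATES, WEIGHT_DECAYS, DROPOUTS):
        acc = train_and_validate(lr=lr, weight_decay=wd, dropout=dropout)
        if acc > best_acc:
            best_acc, best_cfg = acc, dict(lr=lr, weight_decay=wd, dropout=dropout)
    return best_cfg, best_acc
```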