VQGraph: Rethinking Graph Representation Space for Bridging GNNs and MLPs
Authors: Ling Yang, Ye Tian, Minkai Xu, Zhongyi Liu, Shenda Hong, Wei Qu, Wentao Zhang, Bin Cui, Muhan Zhang, Jure Leskovec
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across seven datasets show VQGRAPH can consistently outperform GNNs by 3.90% on average accuracy, while enjoying 828× faster inference speed. Also, VQGRAPH outperforms MLPs and the SOTA distillation method NOSMOG (Tian et al., 2023b) by 28.05% and 1.39% on average accuracy across datasets, respectively. |
| Researcher Affiliation | Collaboration | ¹Peking University, ²Ant Group, ³Stanford University |
| Pseudocode | No | The paper describes methods in text and uses mathematical equations, but it does not include a clearly labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | Our code is available at https://github.com/YangLing0818/VQGraph |
| Open Datasets | Yes | We use five widely used public benchmark datasets (Zhang et al., 2022b; Yang et al., 2021a) (Citeseer, Pubmed, Cora, A-computer, and A-photo), and two large OGB datasets (Hu et al., 2020a) (Arxiv and Products) to evaluate the proposed model. |
| Dataset Splits | Yes | For the tran (transductive) setting, we train our models on the labeled graph G, along with the corresponding feature matrix X^L and label vector Y^L, before evaluating their performance on the unlabeled data X^U and Y^U. Soft labels and soft code assignments are generated for all nodes within the graph (i.e., y_v^soft, r_v^GNN, r_v^MLP for v ∈ V). As for the ind (inductive) setting, we follow the methodology of prior work (Tian et al., 2023b) in randomly selecting 20% of the data for inductive evaluation. Specifically, we divide the unlabeled nodes V^U into two separate yet non-overlapping subsets, observed and inductive (i.e., V^U = V^U_obs ∪ V^U_ind), producing three distinct graphs, G = G^L ∪ G^U_obs ∪ G^U_ind, wherein there are no shared nodes. ... we employ a test dataset V^U_ind, which contains 20% of the test data, and another dataset V^U_obs, containing the remaining 80% of the test data. (See the split sketch below the table.) |
| Hardware Specification | No | The paper discusses inference time and efficiency but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as "Python 3.8" or "PyTorch 1.9". |
| Experiment Setup | Yes | Table 12: Hyperparameters of VQGRAPH. This includes details such as the number of MLP layers, hidden dimension, learning rate, weight decay, dropout, and the weighting factors for L_distill^class (α) and L_distill^code (β); a hedged sketch of the weighted objective follows the table. |
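
For concreteness, the 80%/20% observed/inductive partition quoted in the Dataset Splits row (following Tian et al., 2023b) can be reproduced in a few lines. This is a minimal sketch only; the function name `split_unlabeled_nodes`, the fixed seed, and the NumPy-based implementation are illustrative assumptions and are not taken from the VQGraph repository.

```python
# Minimal sketch of the observed/inductive split described in the Dataset Splits row.
# Names and the fixed seed are illustrative, not from the official VQGraph code.
import numpy as np

def split_unlabeled_nodes(unlabeled_idx, ind_ratio=0.2, seed=0):
    """Randomly partition unlabeled node indices into disjoint
    'observed' (80%) and 'inductive' (20%) subsets, i.e. V^U = V^U_obs ∪ V^U_ind."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(unlabeled_idx)
    n_ind = int(len(perm) * ind_ratio)
    ind_idx = perm[:n_ind]   # held-out inductive nodes (unseen during training)
    obs_idx = perm[n_ind:]   # observed unlabeled nodes (present in the training graph)
    return obs_idx, ind_idx

# Example: 1000 unlabeled nodes -> 800 observed, 200 inductive
obs_idx, ind_idx = split_unlabeled_nodes(np.arange(1000))
assert len(obs_idx) == 800 and len(ind_idx) == 200
```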
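The α and β factors listed in Table 12 weight the two distillation terms against the supervised objective. Below is a hedged sketch of how such a weighted objective is commonly assembled for soft-label and soft code-assignment distillation; the exact definitions of L_distill^class and L_distill^code are given in the paper, and the KL-divergence forms, the temperature `tau`, the argument names, and the example tensor shapes here are assumptions for illustration only.

```python
# Hedged sketch (not the authors' implementation): supervised cross-entropy plus
# alpha-weighted class-level distillation and beta-weighted code-level distillation.
import torch
import torch.nn.functional as F

def total_loss(mlp_logits, labels, gnn_soft_labels, mlp_code_logits, gnn_code_probs,
               alpha=1.0, beta=1.0, tau=1.0):
    l_ce = F.cross_entropy(mlp_logits, labels)                    # supervised term on labeled nodes
    l_class = F.kl_div(F.log_softmax(mlp_logits / tau, dim=-1),   # ~ L_distill^class (soft labels)
                       F.softmax(gnn_soft_labels / tau, dim=-1),
                       reduction="batchmean")
    l_code = F.kl_div(F.log_softmax(mlp_code_logits, dim=-1),     # ~ L_distill^code (soft code assignments)
                      gnn_code_probs,                             # teacher's soft code-assignment distribution
                      reduction="batchmean")
    return l_ce + alpha * l_class + beta * l_code

# Usage with arbitrary example shapes: 32 nodes, 7 classes, a codebook of 256 codes
logits = torch.randn(32, 7)
labels = torch.randint(0, 7, (32,))
loss = total_loss(logits, labels, torch.randn(32, 7),
                  torch.randn(32, 256),
                  torch.softmax(torch.randn(32, 256), dim=-1))
```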