NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning
Authors: Wentao Zhang, Zeang Sheng, Mingyu Yang, Yang Li, Yu Shen, Zhi Yang, Bin Cui
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on four benchmark datasets in two different application scenarios: node clustering and link prediction. Remarkably, NAFS with feature ensemble outperforms the state-of-the-art GNNs on these tasks and mitigates the aforementioned two limitations of most learning-based GNN counterparts. |
| Researcher Affiliation | Academia | 1School of CS & Key Laboratory of High Confidence Software Technologies, Peking University 2Institute of Computational Social Science, Peking University (Qingdao), China. |
| Pseudocode | Yes | Alg. 1 shows the whole pipeline of our proposed NAFS. |
| Open Source Code | Yes | The source code of NAFS can be found on GitHub (https://github.com/PKU-DAIR/NAFS). |
| Open Datasets | Yes | Several widely-used network datasets (i.e., Cora, Citeseer, PubMed, Wiki and ogbn-products) are used in our experiments. We include the properties of these datasets in Appendix B.1. [...] Cora, Citeseer, PubMed, and Wiki are four popular network datasets. The first three (Yang et al., 2016) are citation networks... Wiki (Yang et al., 2015) is a webpage network... The ogbn-arxiv (Hu et al., 2020) dataset is also a citation network... Besides, the ogbn-products dataset is an undirected and unweighted graph... |
| Dataset Splits | Yes | For the link prediction task, 5% and 10% edges are randomly reserved for the validation set and the test set. |
| Hardware Specification | Yes | The experiments are conducted on a machine with Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz, and a single NVIDIA TITAN RTX GPU with 24GB GPU memory. |
| Software Dependencies | Yes | For software versions, we use Python 3.6, PyTorch 1.7.1, and CUDA 10.1. |
| Experiment Setup | Yes | We run the compared baselines with 200 epochs and repeat the experiment 10 times on all the datasets, and report the mean value of each evaluation metric. The detailed settings of the hyperparameters and experimental environment are introduced in Appendix B.2 and B.3. [...] The optimal value of the maximal smoothing steps ranges from 1 to 70. Hyperparameters for all the baseline methods are tuned with OpenBox (Li et al., 2021) or following the settings in their original paper. |
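The pipeline the Pseudocode row points to (Alg. 1 of the paper) is node-adaptive feature smoothing: repeatedly propagate features with the normalized adjacency, then combine the multi-hop results per node, weighting each node's hops by how far they still are from the over-smoothed stationary state. The sketch below is a minimal, hedged reconstruction of that idea in numpy, not the authors' released code; the dense-matrix form, the softmax weighting, and the function name `nafs_smooth` are all illustrative assumptions.

```python
import numpy as np

def nafs_smooth(adj, feats, max_steps=5):
    """Minimal sketch of node-adaptive feature smoothing (illustrative,
    not the authors' implementation).

    adj:   (n, n) dense symmetric adjacency matrix (no self-loops).
    feats: (n, d) node feature matrix.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                       # add self-loops
    d = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(d ** -0.5)
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # D^-1/2 (A+I) D^-1/2

    # Stationary state that repeated smoothing converges to:
    # (A_norm^inf)_ij = sqrt(d_i) sqrt(d_j) / sum(d).
    stat = (np.outer(np.sqrt(d), np.sqrt(d)) / d.sum()) @ feats

    # Collect the 0..max_steps-hop smoothed features.
    hops, x = [feats], feats
    for _ in range(max_steps):
        x = a_norm @ x
        hops.append(x)

    # Per-node, per-hop weight: hops farther from the stationary state
    # (i.e. less over-smoothed) receive larger weight (softmax over hops).
    dists = np.stack([np.linalg.norm(h - stat, axis=1) for h in hops])
    e = np.exp(dists - dists.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)          # (K+1, n), columns sum to 1

    return sum(w[k][:, None] * hops[k] for k in range(len(hops)))
```

Because the method is training-free, the whole pipeline is a handful of matrix products; this is what lets NAFS scale to graphs like ogbn-products on a single GPU.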
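The Dataset Splits row describes the link-prediction protocol: 5% of edges are randomly reserved for validation and 10% for test. A minimal sketch of such a random edge split is below; the function name and the exact shuffling scheme are assumptions, not the paper's released preprocessing code.

```python
import numpy as np

def split_edges(edges, val_frac=0.05, test_frac=0.10, seed=0):
    """Randomly hold out edges for validation/test (illustrative sketch).

    edges: (m, 2) array of undirected edge pairs.
    Returns (train, val, test) edge arrays.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(edges))             # random edge order
    n_val = int(val_frac * len(edges))
    n_test = int(test_frac * len(edges))
    val = edges[idx[:n_val]]
    test = edges[idx[n_val:n_val + n_test]]
    train = edges[idx[n_val + n_test:]]           # remaining 85% for training
    return train, val, test
```

In practice one would also sample an equal number of non-edges as negatives for evaluation; that step is omitted here for brevity.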