NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning

Authors: Wentao Zhang, Zeang Sheng, Mingyu Yang, Yang Li, Yu Shen, Zhi Yang, Bin Cui

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on four benchmark datasets on two different application scenarios: node clustering and link prediction. Remarkably, NAFS with feature ensemble outperforms the state-of-the-art GNNs on these tasks and mitigates the aforementioned two limitations of most learning-based GNN counterparts."
Researcher Affiliation | Academia | "(1) School of CS & Key Laboratory of High Confidence Software Technologies, Peking University; (2) Institute of Computational Social Science, Peking University (Qingdao), China."
Pseudocode | Yes | "Alg. 1 shows the whole pipeline of our proposed NAFS." (See the pipeline sketch after this table.)
Open Source Code | Yes | "The source code of NAFS can be found in Anonymous Github (https://github.com/PKU-DAIR/NAFS)."
Open Datasets | Yes | "Several widely-used network datasets (i.e., Cora, Citeseer, PubMed, Wiki and ogbn-products) are used in our experiments. We include the properties of these datasets in Appendix B.1. [...] Cora, Citeseer, PubMed, and Wiki are four popular network datasets. The first three (Yang et al., 2016) are citation networks... Wiki (Yang et al., 2015) is a webpage network... The ogbn-arxiv (Hu et al., 2020) dataset is also a citation network... Besides, the ogbn-products dataset is an undirected and unweighted graph..."
Dataset Splits | Yes | "For the link prediction task, 5% and 10% of the edges are randomly reserved for the validation set and the test set." (See the edge-split sketch after this table.)
Hardware Specification | Yes | "The experiments are conducted on a machine with Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz, and a single NVIDIA TITAN RTX GPU with 24GB GPU memory."
Software Dependencies | Yes | "For software versions, we use Python 3.6, PyTorch 1.7.1, and CUDA 10.1."
Experiment Setup | Yes | "We run the compared baselines with 200 epochs and repeat the experiment 10 times on all the datasets, and report the mean value of each evaluation metric. The detailed settings of the hyperparameters and experimental environment are introduced in Appendix B.2 and B.3. [...] The optimal value of the maximal smoothing steps ranges from 1 to 70. Hyperparameters for all the baseline methods are tuned with OpenBox (Li et al., 2021) or following the settings in their original papers." (See the tuning sketch after this table.)
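
Since the pipeline in Alg. 1 is central to reproducing NAFS, below is a minimal sketch of a NAFS-style pipeline in Python: features are propagated for up to k_max steps with the symmetrically normalized adjacency, and each node combines its multi-hop features with node-adaptive weights. The weighting scheme here (a softmax over per-node L2 distances to an iteratively approximated stationary state) and all function names are assumptions made for illustration, not the authors' reference implementation.

    import numpy as np
    import scipy.sparse as sp

    def sym_norm_adj(adj):
        # GCN-style normalization: D^{-1/2} (A + I) D^{-1/2}.
        adj = (adj + sp.eye(adj.shape[0])).tocsr()
        deg = np.asarray(adj.sum(axis=1)).ravel()
        d_inv_sqrt = sp.diags(np.power(deg, -0.5))
        return d_inv_sqrt @ adj @ d_inv_sqrt

    def nafs_smooth(adj, x, k_max=20, inf_steps=200):
        # Smooth features for 0..k_max hops, then combine them per node.
        a_hat = sym_norm_adj(adj)
        # Approximate the over-smoothed stationary state by propagating
        # many times (an approximation chosen for this sketch).
        x_inf = x.copy()
        for _ in range(inf_steps):
            x_inf = a_hat @ x_inf
        feats, dists = [], []
        x_k = x.copy()
        for _ in range(k_max + 1):
            feats.append(x_k)
            # Nodes still far from the stationary state get larger weights.
            dists.append(np.linalg.norm(x_k - x_inf, axis=1))
            x_k = a_hat @ x_k
        d = np.stack(dists)                   # shape: (k_max + 1, n_nodes)
        d = d - d.max(axis=0, keepdims=True)  # stabilized softmax over steps
        w = np.exp(d) / np.exp(d).sum(axis=0, keepdims=True)
        return sum(w_k[:, None] * f for w_k, f in zip(w, feats))

The paper's "feature ensemble" variant additionally combines embeddings produced under several smoothing strategies (e.g., mean, max, or concatenation) before the downstream node clustering or link prediction step.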
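
The edge split in the "Dataset Splits" row is straightforward to reproduce. A minimal sketch, assuming edges are given as an (m, 2) integer array; the helper name is hypothetical, and negative (non-edge) sampling, which link prediction evaluation typically also needs, is omitted:

    import numpy as np

    def split_edges(edges, val_ratio=0.05, test_ratio=0.10, seed=0):
        # Shuffle once, then reserve 5% of the edges for validation and
        # 10% for testing; the remainder is kept for training.
        rng = np.random.default_rng(seed)
        edges = edges[rng.permutation(len(edges))]
        n_val = int(len(edges) * val_ratio)
        n_test = int(len(edges) * test_ratio)
        val, test, train = np.split(edges, [n_val, n_val + n_test])
        return train, val, test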
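
For the hyperparameter search in the "Experiment Setup" row, the paper tunes with OpenBox, a Bayesian optimization system. As a plain random-search stand-in for OpenBox, the loop below searches the maximal smoothing step over the reported range [1, 70]; evaluate is a hypothetical callback that computes embeddings with a given maximal step and returns a validation score:

    import random

    def tune_k_max(evaluate, n_trials=30, seed=0):
        # Random search over the maximal smoothing step k in [1, 70];
        # evaluate(k) must return a validation score to maximize.
        rng = random.Random(seed)
        best_k, best_score = None, float("-inf")
        for _ in range(n_trials):
            k = rng.randint(1, 70)
            score = evaluate(k)
            if score > best_score:
                best_k, best_score = k, score
        return best_k, best_score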