Graph Contrastive Learning with Stable and Scalable Spectral Encoding

Authors: Deyu Bo, Yuan Fang, Yang Liu, Chuan Shi

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on the node- and graph-level datasets show that our method not only learns effective graph representations but also achieves a 2-10x speedup over other spectral-based methods."
Researcher Affiliation | Academia | "Deyu Bo (1), Yuan Fang (2), Yang Liu (1), Chuan Shi (1); (1) Beijing University of Posts and Telecommunications, China; (2) Singapore Management University, Singapore"
Pseudocode | Yes | "In order to better demonstrate our algorithm, here we provide the pseudo algorithms of EigenMLP (Figure 4) and Sp2GCL (Figure 5)."
Open Source Code | Yes | "Our code is attached in the supplementary material."
Open Datasets | Yes | "Datasets. In the node classification task, we consider using graphs with different scales to evaluate both the effectiveness and scalability of GCL methods. Specifically, for the small graphs (< 50,000), we use Pubmed [14], Wiki-CS [22], and Facebook [23] datasets. For the large graphs (> 50,000), we use Flickr [43], arXiv [9], and PPI [7] datasets. Additional statistics are provided in Appendix B. ... OGB-graph: https://github.com/snap-stanford/ogb (MIT license) ZINC-2M: https://github.com/snap-stanford/pretrain-gnns (MIT license)" (a dataset-loading sketch is given after the table)
Dataset Splits | Yes | "For the Facebook dataset, we randomly split the nodes into train/validation/test data with a ratio of 1:1:8. For other datasets, we use the public splits for a fair comparison. ... and the number of training epochs is chosen among {20, 50, 80, 100, 150} using the validation set, as suggested by AD-GCL [27]." (the random split is sketched in code after the table)
Hardware Specification | Yes | "GPU information: GeForce RTX 3090 (24 GB)"
Software Dependencies | No | The paper mentions "Linux version: 5.19.0-38-generic" and "Operating system: Ubuntu 22.04.2", but it does not specify software dependencies such as PyTorch, TensorFlow, or scikit-learn with version numbers.
Experiment Setup | Yes | "We use a two-layer GCN as the encoder for all datasets and set the hidden dimension d = 512 for all methods. ... For the mini-batch training, we set the batch size to 1024. ... The learning rate is set to 0.001 and the period to 10 for all datasets, and the number of training epochs is chosen among {20, 50, 80, 100, 150} using the validation set... For all experiments, we use the Adam optimizer." (this configuration is sketched in code after the table)
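
The "Open Datasets" row above points to the OGB repository for the larger benchmarks. As a minimal, hedged illustration, the sketch below loads ogbn-arxiv through the standard OGB Python package; the dataset name, root directory, and API calls follow conventional OGB usage and are not taken from the paper's released code.

```python
from ogb.nodeproppred import PygNodePropPredDataset

# Download/load the ogbn-arxiv node-classification graph (an assumption:
# the paper's "arXiv" dataset corresponds to this OGB benchmark).
dataset = PygNodePropPredDataset(name="ogbn-arxiv", root="data/")
graph = dataset[0]                    # a single PyTorch Geometric Data object
split_idx = dataset.get_idx_split()   # public train/valid/test node indices

print(graph.num_nodes, graph.num_edges)
print({split: idx.shape[0] for split, idx in split_idx.items()})
```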
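
The "Dataset Splits" row quotes a random 1:1:8 train/validation/test split for the Facebook dataset. Below is a minimal sketch of such a split, assuming a PyTorch-based pipeline; the function name, seeding strategy, and node count are illustrative and not specified in the paper.

```python
import torch

def random_node_split(num_nodes: int, train_frac: float = 0.1,
                      val_frac: float = 0.1, seed: int = 0):
    """Randomly assign nodes to train/validation/test with a 1:1:8 ratio."""
    generator = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_nodes, generator=generator)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    train_idx = perm[:n_train]
    val_idx = perm[n_train:n_train + n_val]
    test_idx = perm[n_train + n_val:]        # remaining ~80% of the nodes
    return train_idx, val_idx, test_idx

# Replace 10_000 with the actual number of nodes in the Facebook graph.
train_idx, val_idx, test_idx = random_node_split(num_nodes=10_000)
```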
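
The "Experiment Setup" row specifies a two-layer GCN encoder with hidden dimension 512 trained with Adam at a learning rate of 0.001. The sketch below shows one plausible PyTorch Geometric realization of that configuration; the class name, ReLU activation, input feature size, and the omission of the contrastive objective are assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNEncoder(torch.nn.Module):
    """Two-layer GCN encoder with hidden dimension 512, as quoted above."""

    def __init__(self, in_dim: int, hidden_dim: int = 512):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))   # ReLU between layers is an assumption
        return self.conv2(h, edge_index)

encoder = GCNEncoder(in_dim=128)   # placeholder input feature size
optimizer = torch.optim.Adam(encoder.parameters(), lr=0.001)
# Mini-batch training with batch size 1024 and the contrastive loss are omitted.
```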