SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning

Authors: Junran Wu, Xueyuan Chen, Bowen Shi, Shangzhe Li, Ke Xu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We extensively validate the proposed anchor view on various benchmarks regarding graph classification under unsupervised, semi-supervised, and transfer learning and achieve significant performance boosts compared to the state-of-the-art methods." "Extensive experiments, including unsupervised, semi-supervised, and transfer learning, are conducted on various benchmarks regarding graph classification. Superior performance can be observed in comparison with those state-of-the-art (SOTA) methods." "The contributions of this work can be summarized as follows: We extensively evaluate the proposed anchor view on various benchmarks under the setting of unsupervised, semi-supervised, and transfer learning, and obtain superior performance compared to the SOTA methods." "In this section, we are devoted to evaluating SEGA with extensive experiments." "Table 1. Average accuracies (%) ± Std. of compared methods via unsupervised learning." "Table 2. Average test ROC-AUC (%) ± Std. over 10 different runs of SEGA along with all baselines on nine downstream benchmarks." "Table 3. Average accuracies (%) ± Std. of compared methods via semi-supervised representation learning with 10% labels." "Table 4. Orthogonal experiment results (%) of SEGA with SOTAs in unsupervised representation learning." "Here, we make an in-depth analysis of the performance of SEGA under the setting of unsupervised learning."
Researcher Affiliation | Academia | (1) State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China; (2) Zhongguancun Laboratory, Beijing 100094, China.
Pseudocode | Yes | Algorithm 1 (Structural uncertainty minimization). Input: the given height k > 1 and the candidate graph G = (V, E). Output: a coding tree T that meets the height bar.
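The quoted Algorithm 1 constructs a coding tree by minimizing structural entropy under a height constraint. The sketch below is not that algorithm; it only illustrates the objective on the simplest (height-2) coding tree, greedily merging singleton communities while a merge lowers the two-dimensional structural entropy. The use of networkx, the function names, the naive re-evaluation of every candidate merge, and the omission of the height-k compression step are all assumptions made for illustration.

```python
# Illustrative greedy minimization of two-dimensional structural entropy
# (root -> communities -> vertices). A simplified stand-in, NOT the paper's
# Algorithm 1, which builds a coding tree of arbitrary height k.
import itertools
import math

import networkx as nx


def partition_entropy(G: nx.Graph, communities: list[set]) -> float:
    """Structural entropy of G under a height-2 coding tree given by a partition."""
    two_m = 2 * G.number_of_edges()
    H = 0.0
    for C in communities:
        vol = sum(G.degree(v) for v in C)                            # community volume
        cut = sum(1 for u, v in G.edges() if (u in C) != (v in C))   # boundary edges
        if vol == 0:
            continue
        H -= (cut / two_m) * math.log2(vol / two_m)                  # community-level term
        for v in C:
            d = G.degree(v)
            if d > 0:
                H -= (d / two_m) * math.log2(d / vol)                # vertex-level term
    return H


def greedy_merge(G: nx.Graph) -> list[set]:
    """Merge communities greedily as long as a merge reduces structural entropy."""
    communities = [{v} for v in G.nodes()]                           # start from singletons
    while len(communities) > 1:
        base = partition_entropy(G, communities)
        best = None
        for i, j in itertools.combinations(range(len(communities)), 2):
            trial = [c for idx, c in enumerate(communities) if idx not in (i, j)]
            trial.append(communities[i] | communities[j])
            delta = partition_entropy(G, trial) - base
            if delta < 0 and (best is None or delta < best[0]):
                best = (delta, trial)
        if best is None:                                             # no merge helps: stop
            break
        communities = best[1]
    return communities


if __name__ == "__main__":
    G = nx.barbell_graph(5, 0)          # two cliques joined by a single edge
    print(greedy_merge(G))              # typically recovers the two cliques
```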
Open Source Code | Yes | The code of SEGA is available at https://github.com/Wu-Junran/SEGA.
Open Datasets | Yes | For unsupervised and semi-supervised learning, various benchmarks are adopted from TUDataset (Morris et al., 2020), including COLLAB, REDDIT-BINARY, REDDIT-MULTI-5K, IMDB-BINARY, GITHUB, NCI1, MUTAG, PROTEINS, and DD. For transfer learning, the ZINC15 dataset (Sterling & Irwin, 2015) is adopted for biochemical pre-training. We employ the eight ubiquitous benchmarks from the MoleculeNet dataset (Wu et al., 2018) as the biochemical downstream experiments.
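For anyone re-running these benchmarks, the snippet below shows one common way to fetch the listed TUDataset graphs; the storage path and the choice of PyTorch Geometric's TUDataset loader are assumptions on my part rather than details stated in the paper (GITHUB is omitted here).

```python
# Possible way to fetch the TUDataset benchmarks named above with PyTorch
# Geometric; the root path is arbitrary and GITHUB is omitted here.
from torch_geometric.datasets import TUDataset

names = ["MUTAG", "NCI1", "PROTEINS", "DD", "COLLAB",
         "IMDB-BINARY", "REDDIT-BINARY", "REDDIT-MULTI-5K"]
for name in names:
    dataset = TUDataset(root="data/TUDataset", name=name)
    print(name, len(dataset), dataset.num_classes)
```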
Dataset Splits | Yes | For datasets with a public training/validation/test split, pre-training is performed only on the training dataset, finetuning is conducted with 10% of the training data, and final evaluation results are from the validation/test sets. For datasets without such splits, all samples are employed for pre-training, while finetuning and evaluation are performed over 10 folds. The split for train/validation/test sets is 80%:10%:10%. The effective split ratio for the train/validation/prior/test sets is 69%:12%:9.5%:9.5%.
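The 80%:10%:10% split and the 10-fold protocol quoted above can be reproduced with standard tooling; the sketch below uses scikit-learn on one benchmark, and the seed, shuffling policy, and lack of stratification in the holdout split are assumptions since the report does not state how the folds were generated.

```python
# Minimal sketch of the quoted splits; the fixed seed is an assumption.
from sklearn.model_selection import StratifiedKFold, train_test_split
from torch_geometric.datasets import TUDataset

dataset = TUDataset(root="data/TUDataset", name="MUTAG")   # any labelled benchmark
labels = [int(data.y) for data in dataset]
indices = list(range(len(dataset)))

# 80%:10%:10% train/validation/test split
train_idx, rest_idx = train_test_split(indices, test_size=0.2, random_state=0)
val_idx, test_idx = train_test_split(rest_idx, test_size=0.5, random_state=0)

# 10-fold protocol for datasets without a public split
folds = list(StratifiedKFold(n_splits=10, shuffle=True,
                             random_state=0).split(indices, labels))
```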
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only refers to general model architectures like GIN and ResGCN.
Software Dependencies | No | The paper mentions software like "RDKit (Landrum, 2013)" and "Adam optimizer (Kingma & Ba, 2015)" but does not specify exact version numbers for these software packages or any other key libraries. It only cites the papers that introduced them.
Experiment Setup | Yes | In unsupervised learning, GIN (Xu et al., 2019) with 32 hidden units and 3 layers is set up. In addition, the same data augmentations on graphs with the default augmentation strength 0.2 are adopted. In transfer learning, GIN is used with 5 layers and 300 hidden dimensions. In semi-supervised learning, ResGCN with 128 hidden units and 5 layers is set up for pre-training and finetuning. The hidden dimension is chosen from {32, 64}, and the batch size is chosen from {32, 128}. An Adam optimizer (Kingma & Ba, 2015) is employed to minimize the contrastive loss with a learning rate chosen from {0.01, 0.005, 0.001}. For pre-training, the learning rate is tuned in {0.01, 0.001, 0.0001} and the epoch number in {20, 40, 60, 80, 100}, where grid search is performed. The batch size is set as 256, and all training processes run for 100 epochs.
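To make the quoted unsupervised configuration concrete, here is a skeleton of how the reported encoder and optimizer settings might be wired up; the dataset choice, module names, and loop scaffolding are my assumptions, and SEGA's contrastive objective between the augmented view and the structural-entropy anchor view is left as a placeholder rather than reimplemented.

```python
# Configuration sketch for the reported unsupervised setting: GIN with
# 3 layers and 32 hidden units, Adam optimizer, batch size 256, 100 epochs.
# The SEGA loss itself is intentionally omitted (placeholder comments below).
import torch
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GIN, global_add_pool

dataset = TUDataset(root="data/TUDataset", name="MUTAG")        # example benchmark
loader = DataLoader(dataset, batch_size=256, shuffle=True)

encoder = GIN(in_channels=dataset.num_node_features,
              hidden_channels=32, num_layers=3)
optimizer = torch.optim.Adam(encoder.parameters(), lr=0.01)     # lr from {0.01, 0.005, 0.001}

for epoch in range(100):
    for batch in loader:
        optimizer.zero_grad()
        h = encoder(batch.x, batch.edge_index)                  # node-level embeddings
        z = global_add_pool(h, batch.batch)                     # graph-level readout
        # loss = sega_contrastive_loss(z_augmented, z_anchor)   # hypothetical; not reproduced
        # loss.backward(); optimizer.step()
```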