ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification
Authors: Liang Zeng, Lanqing Li, Ziqi Gao, Peilin Zhao, Jian Li
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple imbalanced graph datasets and imbalanced settings demonstrate the effectiveness of our proposed framework, which significantly improves the performance of the recent state-of-the-art GCL methods. |
| Researcher Affiliation | Collaboration | (1) Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University; (2) Tencent AI Lab; (3) Hong Kong University of Science and Technology |
| Pseudocode | Yes | Algorithm 1: The ImGCL pre-training algorithm |
| Open Source Code | No | The paper mentions using the 'PyGCL (Zhu et al. 2021a) open-source library' for baselines but does not provide a link or explicit statement about the availability of the source code for their proposed ImGCL method. |
| Open Datasets | Yes | Dataset. We use four widely-used datasets including Wiki-CS, Amazon-computers, Amazon-photo, and DBLP, to comprehensively study the performance of transductive node classification. |
| Dataset Splits | Yes | Following (Zhu et al. 2021b), the training set is randomly sampled from the rest according to train/valid/test ratios = 1:1:8, which is highly imbalanced. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'PyGCL (Zhu et al. 2021a) open-source library' but does not provide specific version numbers for any software components, which is required for reproducibility. |
| Experiment Setup | Yes | In ImGCL, we set the number of clusters K in the node centrality based PBS method equal to the number of classes in the downstream task. ... we re-balance the class distribution every B epochs... We select N_l nodes during the pre-training phase in ImGCL, where l = 10% equals the ratio of training data in the downstream task. |
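
The table above reports that the training set is drawn randomly with train/valid/test ratios of 1:1:8. As a minimal sketch of what such a split might look like, the snippet below partitions node indices by those ratios; the function name and mechanics are illustrative assumptions, not taken from the authors' code, and the reported split is additionally class-imbalanced, which this sketch does not reproduce.

```python
import numpy as np

def random_split(num_nodes, ratios=(1, 1, 8), seed=0):
    """Randomly partition node indices into train/valid/test sets
    according to the given ratios (1:1:8 as reported in the table).
    Illustrative sketch only, not the ImGCL implementation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_nodes)
    total = sum(ratios)
    n_train = num_nodes * ratios[0] // total
    n_valid = num_nodes * ratios[1] // total
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]
    return train, valid, test

train, valid, test = random_split(10000)
# 10% train, 10% valid, 80% test
```

With 10,000 nodes this yields 1,000 training, 1,000 validation, and 8,000 test nodes, matching the 1:1:8 ratio.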