Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery
Authors: Pengwei Yan, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Tianqianjin Lin, Changlong Sun, Xiaozhong Liu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on 15 datasets validate DGPM's effectiveness and generalizability, outperforming state-of-the-art methods in unsupervised representation learning and transfer learning settings. The autonomously discovered motifs demonstrate the potential of DGPM to enhance robustness and interpretability. |
| Researcher Affiliation | Collaboration | 1Department of Information Resources Management, Zhejiang University, Hangzhou, 310058, China 2Alibaba Group, Hangzhou, 311121, China 3Northeastern University, Shenyang, 110819, China 4Computer Science Department, Worcester Polytechnic Institute, Worcester, 01609-2280, MA, USA |
| Pseudocode | No | The paper describes its methodology and components using text and equations (e.g., in the 'Methodology' section), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/RocccYan/DGPM. |
| Open Datasets | Yes | To validate unsupervised representation learning, we conducted experiments on 7 graph classification benchmarks (Hou et al. 2022) from four distinct domains: MUTAG, IMDB-B, IMDB-M, PROTEINS, COLLAB, REDDIT-B, and NCI1. ... 250k unlabeled molecules sampled from the ZINC15 (Sterling and Irwin 2015) are used for pretraining and 8 molecular benchmark datasets (Wu et al. 2018) are used for finetuning and testing: BBBP, Tox21, ToxCast, SIDER, ClinTox, MUV, HIV, and BACE. |
| Dataset Splits | No | We followed the experimental setup employed in previous research work, such as data splits and evaluation metrics. ... for the unsupervised representation learning task, we adopted the experimental setup from (Zhang et al. 2021a; Hou et al. 2022); for the transfer learning task, we followed the setup established in (Hu et al. 2019; You et al. 2020, 2021). ... The downstream datasets are partitioned using scaffold-split to emulate real-world scenarios. ... We report the mean 10-fold cross-validation accuracy with standard deviation after 5 runs. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for running the experiments (e.g., CPU or GPU models, memory, or cloud computing resources). |
| Software Dependencies | No | The paper states 'all implementations carried out using the PyTorch Geometric package' but does not specify version numbers for PyTorch Geometric or any other software dependencies. |
| Experiment Setup | Yes | The hidden dimension is set to 128 for both node and motif representations. The framework is trained using the AdamW optimizer for 100 epochs, with all implementations carried out using the PyTorch Geometric package. |
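The reported setup and evaluation protocol can be summarized in code. This is a minimal sketch, not the authors' implementation: `DGPM_CONFIG` collects only the hyperparameters the paper reports (learning rate, batch size, and library versions are unreported and omitted), and `report_accuracy` is a hypothetical helper illustrating how the "mean 10-fold cross-validation accuracy with standard deviation after 5 runs" figure would be aggregated.

```python
import statistics

# Hyperparameters as reported in the paper's experiment setup.
# Unreported values (learning rate, batch size, PyG version) are omitted.
DGPM_CONFIG = {
    "hidden_dim": 128,      # shared by node and motif representations
    "optimizer": "AdamW",   # trained with the AdamW optimizer
    "epochs": 100,
    "framework": "PyTorch Geometric",  # version not specified in the paper
}

def report_accuracy(fold_accuracies_per_run):
    """Aggregate per-fold accuracies into the reported figure.

    fold_accuracies_per_run: one list of 10 fold accuracies per run
    (the paper uses 5 runs). Returns (mean, std) over the per-run
    mean accuracies -- an assumed reading of the paper's protocol.
    """
    run_means = [statistics.mean(run) for run in fold_accuracies_per_run]
    return statistics.mean(run_means), statistics.stdev(run_means)
```

Keeping the configuration in one place like this is a common way to make a reproduction attempt auditable against the paper's stated values.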