Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning

Authors: Ling Yang, Shenda Hong

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In downstream tasks, OEPG consistently achieves the best performance with a 2%~6% accuracy gain on multiple datasets across scales and domains. Notably, OEPG also generalizes to quantity- and topology-imbalance scenarios.
Researcher Affiliation | Academia | Ling Yang (1,2), Shenda Hong (1,2); (1) National Institute of Health Data Science, Peking University, Beijing, China; (2) Institute of Medical Technology, Health Science Center of Peking University, Beijing, China.
Pseudocode | Yes | Algorithm 1: Algorithm of OEPG
Open Source Code | No | The paper does not provide any explicit statement about releasing code or a link to a source code repository.
Open Datasets | Yes | To adequately validate the effectiveness of our OEPG, we use multiple downstream datasets across scales (small, medium and large) and domains (social, academic and biomedical graphs), including TUDataset (Morris et al., 2020), Wiki-CS (Mernyei & Cangea, 2020), Amazon Computers & Amazon Photos (McAuley et al., 2015), Coauthor CS & Coauthor Physics (Sinha et al., 2015), MoleculeNet (Wu et al., 2018), Citeseer, Cora, Pubmed (Sen et al., 2008), and Open Graph Benchmark (OGB) (Hu et al., 2020a). (A hedged data-loading sketch follows the table.)
Dataset Splits | Yes | Then we finetune and evaluate the model on smaller datasets of the same category using the given training/validation/test split. (3) In semi-supervised learning (You et al., 2020a), for datasets without an explicit train/validation/test split, we first conduct the pre-training process with all graph data. Then we finetune and evaluate the model with K folds (You et al., 2021). For datasets with an explicit split, we only pre-train the model on the training split, finetune on a portion of the training split, and evaluate on the validation/test splits. (A hedged K-fold sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions using an "Adam optimizer (Kingma & Ba, 2014)" but does not specify other software dependencies such as programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or their version numbers.
Experiment Setup | Yes | We use the same GNN architectures with default training hyper-parameters as in the SOTA methods under three experiment settings. In the pre-training process, we use an Adam optimizer (Kingma & Ba, 2014) (learning rate: 1 × 10⁻³) to pre-train the OEPG model... The parameter budget B is set to 4 and we adopt 4-hierarchy (H = 4) ego-semantic descriptors for each subgraph/graph, with the specified cluster numbers [S1, S2, S3, S4] = [16, 12, 8, 4], chosen by grid search with computational efficiency in mind. (A hedged configuration sketch follows the table.)
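
Since all of the benchmarks listed in the Open Datasets row are public, a minimal way to obtain them is through PyTorch Geometric and the OGB package. The paper does not state which data pipeline it used, so the libraries, dataset name strings, and root path below are assumptions for illustration only.

# Hedged sketch: loading the public benchmarks cited in the paper.
# Library choice (PyTorch Geometric + ogb) and paths are assumptions.
from torch_geometric.datasets import (
    TUDataset, WikiCS, Amazon, Coauthor, Planetoid, MoleculeNet,
)
from ogb.nodeproppred import PygNodePropPredDataset

root = "data"  # hypothetical download directory

tu        = TUDataset(root, name="PROTEINS")            # TUDataset (Morris et al., 2020)
wiki      = WikiCS(f"{root}/wikics")                    # Wiki-CS (Mernyei & Cangea, 2020)
computers = Amazon(root, name="Computers")              # Amazon Computers (McAuley et al., 2015)
photos    = Amazon(root, name="Photo")                  # Amazon Photos
cs        = Coauthor(root, name="CS")                   # Coauthor CS (Sinha et al., 2015)
physics   = Coauthor(root, name="Physics")              # Coauthor Physics
cora      = Planetoid(root, name="Cora")                # Cora/Citeseer/Pubmed (Sen et al., 2008)
esol      = MoleculeNet(root, name="ESOL")              # MoleculeNet (Wu et al., 2018)
arxiv     = PygNodePropPredDataset("ogbn-arxiv", root)  # Open Graph Benchmark (Hu et al., 2020a)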
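The Dataset Splits row quotes a semi-supervised protocol in which the model is pre-trained on all graph data and then finetuned/evaluated with K folds (You et al., 2021). A minimal sketch of that evaluation loop is below; the fold count, the `finetune`/`evaluate` callables, and the stratified splitter are assumptions, not the authors' code.

# Hedged sketch of a K-fold finetune/evaluate protocol (semi-supervised
# setting without an explicit split). All function arguments are hypothetical
# placeholders; K = 10 is an assumption.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def kfold_evaluate(dataset, labels, pretrained_encoder, finetune, evaluate, k=10, seed=0):
    # Pre-training on all graph data is assumed to have produced `pretrained_encoder`.
    accs = []
    splitter = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in splitter.split(np.zeros(len(labels)), labels):
        model = finetune(pretrained_encoder, dataset, train_idx)  # finetune on this fold's train split
        accs.append(evaluate(model, dataset, test_idx))           # evaluate on the held-out fold
    return float(np.mean(accs)), float(np.std(accs))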
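The Experiment Setup row reports the quoted pre-training hyper-parameters (Adam, learning rate 1 × 10⁻³, budget B = 4, H = 4 hierarchies, cluster numbers [16, 12, 8, 4]). The sketch below only collects those values; the encoder is a generic placeholder because the OEPG architecture is not released.

# Hedged sketch of the reported pre-training configuration. Only the
# hyper-parameter values come from the paper; the encoder is a hypothetical
# stand-in for the (unreleased) OEPG model.
import torch
from torch import nn

H = 4                              # ego-semantic hierarchies per subgraph/graph
cluster_numbers = [16, 12, 8, 4]   # [S1, S2, S3, S4], chosen by grid search
parameter_budget_B = 4             # parameter budget B
learning_rate = 1e-3               # Adam learning rate

encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))  # placeholder GNN-free encoder
optimizer = torch.optim.Adam(encoder.parameters(), lr=learning_rate)          # Adam (Kingma & Ba, 2014)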