Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning

Authors: Ling Yang, Shenda Hong

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In downstream tasks, OEPG consistently achieves the best performance with a 2%~6% accuracy gain on multiple datasets across scales and domains. Notably, OEPG also generalizes to quantity- and topology-imbalance scenarios.
Researcher Affiliation | Academia | Ling Yang (1,2), Shenda Hong (1,2); (1) National Institute of Health Data Science, Peking University, Beijing, China; (2) Institute of Medical Technology, Health Science Center of Peking University, Beijing, China.
Pseudocode | Yes | Algorithm 1: Algorithm of OEPG
Open Source Code | No | The paper does not provide any explicit statement about releasing code or a link to a source code repository.
Open Datasets | Yes | To adequately validate the effectiveness of our OEPG, we use multiple downstream datasets across scales (small, medium and large) and domains (social, academic and biomedical graphs), including TUDataset (Morris et al., 2020), Wiki-CS (Mernyei & Cangea, 2020), Amazon Computers & Amazon Photos (McAuley et al., 2015), Coauthor CS & Coauthor Physics (Sinha et al., 2015), MoleculeNet (Wu et al., 2018), Citeseer, Cora, Pubmed (Sen et al., 2008), and Open Graph Benchmark (OGB) (Hu et al., 2020a). (A hedged data-loading sketch follows the table.)
Dataset Splits | Yes | Then we finetune and evaluate the model on smaller datasets of the same category using the given training/validation/test split. (3) In semi-supervised learning (You et al., 2020a), for datasets without an explicit train/validation/test split, we first conduct the pre-training process with all graph data. Then we finetune and evaluate the model with K folds (You et al., 2021). For datasets with an explicit split, we only pre-train the model on the training split, finetune on a portion of the training split, and evaluate on the validation/test splits. (A hedged K-fold sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions using an "Adam optimizer (Kingma & Ba, 2014)" but does not specify other software dependencies such as programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or their version numbers.
Experiment Setup | Yes | We use the same GNN architectures with default training hyper-parameters as in the SOTA methods under three experiment settings. In the pre-training process, we use an Adam optimizer (Kingma & Ba, 2014) (learning rate: 1 × 10⁻³) to pre-train the OEPG model... The parameter budget B is set to 4 and we adopt 4-hierarchy (H = 4) ego-semantic descriptors for each subgraph/graph, with the specified cluster numbers [S1, S2, S3, S4] = [16, 12, 8, 4], chosen by grid search with computational efficiency in mind. (A hedged configuration sketch follows the table.)
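
Since all of the benchmarks listed in the Open Datasets row are public, a minimal way to obtain them is through PyTorch Geometric and the OGB package. The paper does not state which data pipeline it used, so the libraries, dataset name strings, and root path below are assumptions for illustration only.

# Hedged sketch: loading the public benchmarks cited in the paper.
# Library choice (PyTorch Geometric + ogb) and paths are assumptions.
from torch_geometric.datasets import (
    TUDataset, WikiCS, Amazon, Coauthor, Planetoid, MoleculeNet,
)
from ogb.nodeproppred import PygNodePropPredDataset

root = "data"  # hypothetical download directory

tu        = TUDataset(root, name="PROTEINS")            # TUDataset (Morris et al., 2020)
wiki      = WikiCS(f"{root}/wikics")                    # Wiki-CS (Mernyei & Cangea, 2020)
computers = Amazon(root, name="Computers")              # Amazon Computers (McAuley et al., 2015)
photos    = Amazon(root, name="Photo")                  # Amazon Photos
cs        = Coauthor(root, name="CS")                   # Coauthor CS (Sinha et al., 2015)
physics   = Coauthor(root, name="Physics")              # Coauthor Physics
cora      = Planetoid(root, name="Cora")                # Cora/Citeseer/Pubmed (Sen et al., 2008)
esol      = MoleculeNet(root, name="ESOL")              # MoleculeNet (Wu et al., 2018)
arxiv     = PygNodePropPredDataset("ogbn-arxiv", root)  # Open Graph Benchmark (Hu et al., 2020a)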
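The Dataset Splits row quotes a semi-supervised protocol in which the model is pre-trained on all graph data and then finetuned/evaluated with K folds (You et al., 2021). A minimal sketch of that evaluation loop is below; the fold count, the `finetune`/`evaluate` callables, and the stratified splitter are assumptions, not the authors' code.

# Hedged sketch of a K-fold finetune/evaluate protocol (semi-supervised
# setting without an explicit split). All function arguments are hypothetical
# placeholders; K = 10 is an assumption.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def kfold_evaluate(dataset, labels, pretrained_encoder, finetune, evaluate, k=10, seed=0):
    # Pre-training on all graph data is assumed to have produced `pretrained_encoder`.
    accs = []
    splitter = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in splitter.split(np.zeros(len(labels)), labels):
        model = finetune(pretrained_encoder, dataset, train_idx)  # finetune on this fold's train split
        accs.append(evaluate(model, dataset, test_idx))           # evaluate on the held-out fold
    return float(np.mean(accs)), float(np.std(accs))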
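The Experiment Setup row reports the quoted pre-training hyper-parameters (Adam, learning rate 1 × 10⁻³, budget B = 4, H = 4 hierarchies, cluster numbers [16, 12, 8, 4]). The sketch below only collects those values; the encoder is a generic placeholder because the OEPG architecture is not released.

# Hedged sketch of the reported pre-training configuration. Only the
# hyper-parameter values come from the paper; the encoder is a hypothetical
# stand-in for the (unreleased) OEPG model.
import torch
from torch import nn

H = 4                              # ego-semantic hierarchies per subgraph/graph
cluster_numbers = [16, 12, 8, 4]   # [S1, S2, S3, S4], chosen by grid search
parameter_budget_B = 4             # parameter budget B
learning_rate = 1e-3               # Adam learning rate

encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))  # placeholder GNN-free encoder
optimizer = torch.optim.Adam(encoder.parameters(), lr=learning_rate)          # Adam (Kingma & Ba, 2014)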