reproducibilityindex.ai

How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

Authors: Dongkwan Kim, Alice Oh

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiment on 17 real-world datasets demonstrates that our recipe generalizes across 15 datasets of them, and our models designed by recipe show improved performance over baselines.
Researcher Affiliation	Academia	Dongkwan Kim & Alice Oh KAIST, Republic of Korea dongkwan.kim@kaist.ac.kr, alice.oh@kaist.edu
Pseudocode	No	The paper describes the model architecture and equations but does not include a clearly labeled pseudocode block or algorithm.
Open Source Code	Yes	We make our code available for future research (https://github.com/dongkwan-kim/SuperGAT).
Open Datasets	Yes	We use a total of 17 real-world datasets (Cora, CiteSeer, PubMed, Cora-ML, Cora-Full, DBLP, ogbn-arxiv, CS, Physics, Photo, Computers, Wiki-CS, Four-Univ, Chameleon, Crocodile, Flickr, and PPI) in diverse domains... See appendix A.1 for detailed description, splits, statistics (including degree and homophily), and references. We follow the train/validation/test split of previous work (Kipf & Welling, 2017).
Dataset Splits	Yes	We follow the train/validation/test split of previous work (Kipf & Welling, 2017). We use 20 samples per class for training, 500 samples for the validation, and 1000 samples for the test.
Hardware Specification	Yes	To demonstrate our model's efficiency, we measure the mean wall-clock time of the entire training process of three runs using a single GPU (GeForce GTX 1080Ti).
Software Dependencies	No	The paper states: "All models are implemented in PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019)." While it cites the papers, it does not provide specific version numbers for these software components (e.g., PyTorch 1.x) within the text.
Experiment Setup	Yes	All parameters are initialized by Glorot initialization (Glorot & Bengio, 2010) and optimized by Adam (Kingma & Ba, 2014). We apply L2 regularization, dropout (Srivastava et al., 2014) to features and attention coefficients, and early stopping on validation loss and accuracy. We use ELU (Clevert et al., 2016) as a non-linear activation ρ. Unless specified, we employ a two-layer SuperGAT with F = 8 features and K = 8 attention heads (total 64 features). For real-world datasets, we tune two hyperparameters (mixing coefficients λ2 and λE) by Bayesian optimization for the mean performance of 3 random seeds. We choose negative sampling ratio pn from {0.3, 0.5, 0.7, 0.9}, and edge sampling ratio pe from {0.6, 0.8, 1.0}. We fix dropout probability to 0.0 for PPI, 0.2 for ogbn-arxiv, 0.6 for others. We set learning rate to 0.05 (ogbn-arxiv), 0.01 (PubMed, PPI, Wiki-CS, Photo, Computers, CS, Physics, Crocodile, Cora-Full, DBLP), 0.005 (Cora, CiteSeer, Cora-ML, Chameleon), 0.001 (Four-Univ).
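
For readers who want to reuse the values in the Dataset Splits and Experiment Setup rows, the sketch below collects them into a plain Python lookup. It is an illustration only, not code from the SuperGAT repository: the names `reported_config`, `sampling_grid`, and `split_per_class` are hypothetical, and the dictionary keys are our own choice. Two assumptions are worth flagging. First, the quoted text gives no learning rate for Flickr, so the sketch returns `None` for it rather than guessing. Second, `split_per_class` draws a random split with the reported sizes (20 per class, 500 validation, 1000 test); the split of Kipf & Welling (2017) that the paper follows is a fixed public split, so this helper only mimics its sizes.

```python
import itertools
import random
from collections import defaultdict

# Learning rates as quoted in the Experiment Setup row (Flickr is not listed there).
_LR_GROUPS = {
    0.05: ("ogbn-arxiv",),
    0.01: ("PubMed", "PPI", "Wiki-CS", "Photo", "Computers", "CS", "Physics",
           "Crocodile", "Cora-Full", "DBLP"),
    0.005: ("Cora", "CiteSeer", "Cora-ML", "Chameleon"),
    0.001: ("Four-Univ",),
}
LEARNING_RATE = {name: lr for lr, names in _LR_GROUPS.items() for name in names}

# Dropout exceptions as quoted; every other dataset uses 0.6.
DROPOUT = {"PPI": 0.0, "ogbn-arxiv": 0.2}


def reported_config(dataset):
    """Return the default settings quoted above for one dataset (hypothetical helper)."""
    return {
        "num_layers": 2,
        "features_per_head": 8,   # F = 8
        "heads": 8,               # K = 8, so 64 features in total
        "activation": "elu",
        "optimizer": "adam",
        "init": "glorot",
        "dropout": DROPOUT.get(dataset, 0.6),
        "learning_rate": LEARNING_RATE.get(dataset),  # None if not stated in the text
    }


def sampling_grid():
    """All (negative sampling ratio, edge sampling ratio) pairs from the quoted sets."""
    return list(itertools.product((0.3, 0.5, 0.7, 0.9), (0.6, 0.8, 1.0)))


def split_per_class(labels, per_class=20, num_val=500, num_test=1000, seed=0):
    """Random split with the reported sizes; an assumption, not the fixed public split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train = []
    for label in sorted(by_class):
        # Raises ValueError if a class has fewer than `per_class` nodes.
        train.extend(rng.sample(by_class[label], per_class))
    chosen = set(train)
    rest = [i for i in range(len(labels)) if i not in chosen]
    rng.shuffle(rest)
    return train, rest[:num_val], rest[num_val:num_val + num_test]
```

Calling `reported_config("Cora")` gives the two-layer, eight-head defaults with the learning rate and dropout quoted for Cora. The mixing coefficients λ2 and λE are left out on purpose, since the text says they were tuned by Bayesian optimization and does not report their values.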