Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision
Authors: Dongkwan Kim, Alice Oh
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiment on 17 real-world datasets demonstrates that our recipe generalizes across 15 datasets of them, and our models designed by recipe show improved performance over baselines. |
| Researcher Affiliation | Academia | Dongkwan Kim & Alice Oh KAIST, Republic of Korea EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture and equations but does not include a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | We make our code available for future research (https://github.com/dongkwan-kim/SuperGAT). |
| Open Datasets | Yes | We use a total of 17 real-world datasets (Cora, CiteSeer, PubMed, Cora-ML, Cora-Full, DBLP, ogbn-arxiv, CS, Physics, Photo, Computers, Wiki-CS, Four-Univ, Chameleon, Crocodile, Flickr, and PPI) in diverse domains... See appendix A.1 for detailed description, splits, statistics (including degree and homophily), and references. We follow the train/validation/test split of previous work (Kipf & Welling, 2017). |
| Dataset Splits | Yes | We follow the train/validation/test split of previous work (Kipf & Welling, 2017). We use 20 samples per class for training, 500 samples for the validation, and 1000 samples for the test. |
| Hardware Specification | Yes | To demonstrate our model's efficiency, we measure the mean wall-clock time of the entire training process of three runs using a single GPU (GeForce GTX 1080Ti). |
| Software Dependencies | No | The paper states: "All models are implemented in PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019)." While it cites the papers, it does not provide specific version numbers for these software components (e.g., PyTorch 1.x) within the text. |
| Experiment Setup | Yes | All parameters are initialized by Glorot initialization (Glorot & Bengio, 2010) and optimized by Adam (Kingma & Ba, 2014). We apply L2 regularization, dropout (Srivastava et al., 2014) to features and attention coefficients, and early stopping on validation loss and accuracy. We use ELU (Clevert et al., 2016) as a non-linear activation ρ. Unless specified, we employ a two-layer SuperGAT with F = 8 features and K = 8 attention heads (total 64 features). For real-world datasets, we tune two hyperparameters (mixing coefficients λ2 and λE) by Bayesian optimization for the mean performance of 3 random seeds. We choose negative sampling ratio pn from {0.3, 0.5, 0.7, 0.9}, and edge sampling ratio pe from {0.6, 0.8, 1.0}. We fix dropout probability to 0.0 for PPI, 0.2 for ogbn-arxiv, 0.6 for others. We set learning rate to 0.05 (ogbn-arxiv), 0.01 (PubMed, PPI, Wiki-CS, Photo, Computers, CS, Physics, Crocodile, Cora-Full, DBLP), 0.005 (Cora, CiteSeer, Cora-ML, Chameleon), 0.001 (Four-Univ). |
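
The per-dataset settings quoted in the Experiment Setup row can be easier to check when laid out as a lookup table. The sketch below only transcribes the values quoted in that row; it is not taken from the authors' repository, and every name in it (`LEARNING_RATES`, `DROPOUT_OVERRIDES`, `setup_for`, and so on) is hypothetical. The quoted text lists no learning rate for Flickr, so the helper raises an error for it rather than guessing. Enumerating all (pn, pe) pairs is also an assumption: the row says only that each ratio is chosen from its candidate set.

```python
from itertools import product

# Values transcribed from the "Experiment Setup" row above.
# All names here are hypothetical, not the authors' configuration format.
LEARNING_RATES = {
    0.05: ["ogbn-arxiv"],
    0.01: ["PubMed", "PPI", "Wiki-CS", "Photo", "Computers", "CS",
           "Physics", "Crocodile", "Cora-Full", "DBLP"],
    0.005: ["Cora", "CiteSeer", "Cora-ML", "Chameleon"],
    0.001: ["Four-Univ"],
}
DROPOUT_OVERRIDES = {"PPI": 0.0, "ogbn-arxiv": 0.2}
DEFAULT_DROPOUT = 0.6  # "0.6 for others"

NEG_SAMPLE_RATIOS = (0.3, 0.5, 0.7, 0.9)  # candidates for p_n
EDGE_SAMPLE_RATIOS = (0.6, 0.8, 1.0)      # candidates for p_e

# Invert the table so a dataset name maps directly to its learning rate.
LR_BY_DATASET = {name: lr for lr, names in LEARNING_RATES.items() for name in names}


def setup_for(dataset):
    """Return the quoted default setup for one dataset.

    Raises KeyError when the quoted text gives no learning rate
    for the dataset (for example, Flickr).
    """
    if dataset not in LR_BY_DATASET:
        raise KeyError(f"no learning rate quoted for {dataset!r}")
    return {
        "num_layers": 2,        # "two-layer SuperGAT"
        "hidden_features": 8,   # F = 8
        "attention_heads": 8,   # K = 8 (64 features in total)
        "activation": "ELU",
        "optimizer": "Adam",
        "learning_rate": LR_BY_DATASET[dataset],
        "dropout": DROPOUT_OVERRIDES.get(dataset, DEFAULT_DROPOUT),
    }


def sampling_ratio_candidates():
    """List every (p_n, p_e) pair; pairing them exhaustively is an assumption."""
    return list(product(NEG_SAMPLE_RATIOS, EDGE_SAMPLE_RATIOS))
```

For example, `setup_for("Cora")` pairs a learning rate of 0.005 with the default dropout of 0.6, while `setup_for("PPI")` uses 0.01 with dropout 0.0. The mixing coefficients λ2 and λE are left out because the row says they are tuned by Bayesian optimization and gives no values.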