Understanding Attention and Generalization in Graph Neural Networks
Authors: Boris Knyazev, Graham W. Taylor, Mohamed Amer
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design simple graph reasoning tasks that allow us to study attention in a controlled environment. We find that under typical conditions the effect of attention is negligible or even harmful, but under certain conditions it provides an exceptional gain in performance... We validate the effectiveness of this scheme on our synthetic datasets, as well as on MNIST and on real graph classification benchmarks... |
| Researcher Affiliation | Collaboration | Boris Knyazev, University of Guelph & Vector Institute (bknyazev@uoguelph.ca); Graham W. Taylor, University of Guelph & Vector Institute, Canada CIFAR AI Chair (gwtaylor@uoguelph.ca); Mohamed R. Amer, Robust.AI (mohamed@robust.ai) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Source code and datasets are available at https://github.com/bknyaz/graph_attention_pool. |
| Open Datasets | Yes | We also experiment with MNIST images [13] and three molecule and social datasets... namely COLLAB [14, 15], PROTEINS [16], and D&D [17]. |
| Dataset Splits | Yes | For synthetic datasets, we tune them on a validation set generated in the same way as TEST-ORIG. For MNIST-75SP, we use part of the training set. For COLLAB, PROTEINS and D&D, we tune them using 10-fold cross-validation on the training set. (See the split sketch below this table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies or library versions are listed; the paper only reports training hyperparameters: "We train all models with Adam [26], learning rate 1e-3, batch size 32, weight decay 1e-4 (see the Supp. Material for details)." |
| Experiment Setup | Yes | We build 2 layer GNNs for COLORS and 3 layer GNNs for other tasks with 64 filters in each layer, except for MNIST-75SP where we have more filters. ... We train all models with Adam [26], learning rate 1e-3, batch size 32, weight decay 1e-4 (see the Supp. Material for details). (See the configuration sketch below this table.) |
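
The Dataset Splits row quotes the tuning protocol for COLLAB, PROTEINS and D&D: hyperparameters are tuned with 10-fold cross-validation on the training set. The sketch below illustrates that protocol; the use of scikit-learn's KFold, the function name, and the seed are assumptions for illustration and are not taken from the released code.

```python
# Minimal sketch of the 10-fold tuning protocol described above,
# assuming scikit-learn's KFold; the paper does not specify a splitting library.
import numpy as np
from sklearn.model_selection import KFold

def ten_fold_splits(num_train_graphs, seed=0):
    """Yield (train_idx, val_idx) index pairs over the training graphs,
    as used for hyperparameter tuning on COLLAB, PROTEINS and D&D."""
    kf = KFold(n_splits=10, shuffle=True, random_state=seed)
    for train_idx, val_idx in kf.split(np.arange(num_train_graphs)):
        yield train_idx, val_idx

# Example usage: average validation accuracy over the 10 folds
# to compare candidate hyperparameter settings.
```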
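
The Experiment Setup row reports 2-layer GNNs for COLORS and 3-layer GNNs for the other tasks, 64 filters per layer, trained with Adam at learning rate 1e-3, batch size 32 and weight decay 1e-4. The sketch below wires those hyperparameters into a minimal PyTorch Geometric model; GCNConv, the mean-pooling readout and all names are stand-ins for illustration, not the paper's attention/pooling architecture.

```python
# Minimal sketch of the reported training configuration: 2-3 graph conv
# layers with 64 filters, Adam, lr 1e-3, batch size 32, weight decay 1e-4.
# GCNConv and global_mean_pool are stand-ins, not the paper's exact layers.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class SimpleGNN(torch.nn.Module):
    def __init__(self, in_dim, num_classes, num_layers=3, hidden=64):
        super().__init__()
        dims = [in_dim] + [hidden] * num_layers
        self.convs = torch.nn.ModuleList(
            [GCNConv(dims[i], dims[i + 1]) for i in range(num_layers)])
        self.classifier = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        # Stack of graph convolutions followed by a graph-level readout.
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
        return self.classifier(global_mean_pool(x, batch))

model = SimpleGNN(in_dim=16, num_classes=2, num_layers=3)  # placeholder dims
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# Batches of 32 graphs would come from a PyTorch Geometric DataLoader
# constructed with batch_size=32.
```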