Improving Attention Mechanism in Graph Neural Networks via Cardinality Preservation

Authors: Shuo Zhang, Lei Xie

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on node and graph classification confirm our theoretical analysis and show the competitive performance of our CPA models.
Researcher Affiliation | Academia | Shuo Zhang (1), Lei Xie (1,2,3). (1) Ph.D. Program in Computer Science, The Graduate Center, The City University of New York; (2) Department of Computer Science, Hunter College, The City University of New York; (3) Helen & Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, Cornell University. szhang4@gradcenter.cuny.edu, lei.xie@hunter.cuny.edu
Pseudocode | No | The paper describes models using mathematical equations but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | The code is available online: https://github.com/zetayue/CPA.
Open Datasets | Yes | In our experiment on graph classification, we use 6 benchmark datasets collected by [Kersting et al., 2020]: 2 social network datasets (REDDIT-BINARY (RE-B), REDDIT-MULTI-5K (RE-M5K)) and 4 bioinformatics datasets (MUTAG, PROTEINS, ENZYMES, NCI1). (A minimal loading sketch for these datasets follows the table.)
Dataset Splits | Yes | For all experiments, we perform 10-fold cross-validation and repeat the experiments 10 times for each dataset and each model. Following [Xu et al., 2019], to obtain a final accuracy for each run, we select the epoch with the best cross-validation accuracy averaged over all 10 folds. (A sketch of this epoch-selection rule follows the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer [Kingma and Ba, 2018] but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | For node classification, we use GAT [Veličković et al., 2018] as the Original model. In the GAT variants, we use 2 GNN layers and a hidden dimensionality of 32. The negative input slope of LeakyReLU in the GAT attention mechanism is 0.2. The number of heads in multi-head attention is 1. We use a dropout ratio of 0 and a weight decay value of 0. For graph classification, we build a GNN (GAT-GC) based on GAT as the Original model: we adopt the attention mechanism in GAT to specify the form of Equation (3). For the readout function, a naive way is to only consider the node embeddings from the last iteration. Although a sufficient number of iterations can help to avoid the cases in Theorem 1 by aggregating more diverse node features, the features from the later iterations may generalize worse, and GNNs usually have shallow structures [Xu et al., 2019; Zhou et al., 2018]. So GAT-GC adopts the same readout function as used in [Xu et al., 2018; Xu et al., 2019; Li et al., 2019], which concatenates graph embeddings from all iterations: $h_G = \big\Vert_{k=0}^{L} \mathrm{Readout}(\{h_i^{k} \mid i \in G\})$. For the Readout function, we use sum for bioinformatics datasets and mean for social network datasets. In the GAT-GC variants, we use 4 GNN layers. The hidden dimensionality is 32 for bioinformatics datasets and 64 for social network datasets. The negative input slope of LeakyReLU is 0.2. We use a single head in the multi-head attention. The following hyper-parameters are tuned for each dataset: (1) batch size in {32, 128}; (2) dropout ratio in {0, 0.5} after the dense layer; (3) L2 regularization from 0 to 0.001. (A minimal model sketch based on this setup follows the table.)
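
The six benchmarks listed under Open Datasets belong to the TU graph-learning collection of [Kersting et al., 2020]. The paper provides its own code at https://github.com/zetayue/CPA; purely as an illustration of how these datasets can be obtained, the sketch below uses PyTorch Geometric's TUDataset wrapper, which is an assumption of this note rather than the authors' pipeline. The dataset names are the registry identifiers in that collection.

```python
# Minimal sketch (not the authors' code): downloading the six TU benchmark
# datasets used in the graph-classification experiments via PyTorch Geometric.
from torch_geometric.datasets import TUDataset

DATASETS = ["REDDIT-BINARY", "REDDIT-MULTI-5K", "MUTAG",
            "PROTEINS", "ENZYMES", "NCI1"]

def load_benchmarks(root="data"):
    """Return a dict mapping dataset name -> TUDataset object."""
    return {name: TUDataset(root=root, name=name) for name in DATASETS}

if __name__ == "__main__":
    for name, ds in load_benchmarks().items():
        print(f"{name}: {len(ds)} graphs, {ds.num_classes} classes")
```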
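
The Dataset Splits row follows the protocol of [Xu et al., 2019]: record validation accuracy for every fold at every epoch, then report the single epoch whose accuracy averaged over the 10 folds is highest. Below is a minimal sketch of that selection rule, assuming the per-fold accuracies have already been collected into an array; the training loop itself and the array name are hypothetical placeholders.

```python
# Sketch of the epoch-selection rule described under "Dataset Splits":
# average the per-fold validation accuracies at each epoch, then report
# the epoch with the highest averaged accuracy.
import numpy as np

def select_best_epoch(fold_accuracies):
    """fold_accuracies: array of shape (n_folds, n_epochs) holding the
    validation accuracy of each fold at each epoch.
    Returns (best_epoch, mean_accuracy_at_that_epoch)."""
    mean_per_epoch = np.asarray(fold_accuracies).mean(axis=0)  # average over folds
    best_epoch = int(mean_per_epoch.argmax())
    return best_epoch, float(mean_per_epoch[best_epoch])

# Hypothetical usage with 10 folds and 100 recorded epochs:
# accs = np.random.rand(10, 100)
# epoch, acc = select_best_epoch(accs)
```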
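
The graph-classification setup above amounts to a 4-layer GAT-style network whose readout concatenates a pooled graph embedding from every iteration, $h_G = \big\Vert_{k=0}^{L} \mathrm{Readout}(\{h_i^{k} \mid i \in G\})$, with sum pooling on bioinformatics datasets and mean pooling on social networks. The sketch below is a minimal reconstruction under those assumptions using PyTorch Geometric's GATConv; it is not the authors' released CPA code, and the cardinality-preserving modifications proposed in the paper are deliberately omitted.

```python
# Minimal sketch (assumptions, not the authors' CPA implementation) of the
# GAT-GC baseline: 4 GATConv layers, a single attention head, LeakyReLU
# negative slope 0.2, and a readout that concatenates the pooled graph
# embedding from every iteration k = 0..L.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, global_add_pool, global_mean_pool

class GATGC(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes,
                 num_layers=4, readout="sum"):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        dims = [in_dim] + [hidden_dim] * num_layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            self.convs.append(GATConv(d_in, d_out, heads=1, negative_slope=0.2))
        # sum pooling for bioinformatics datasets, mean for social networks
        self.pool = global_add_pool if readout == "sum" else global_mean_pool
        # classifier over the concatenated per-iteration readouts
        self.lin = torch.nn.Linear(sum(dims), num_classes)

    def forward(self, x, edge_index, batch):
        readouts = [self.pool(x, batch)]          # k = 0: raw node features
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
            readouts.append(self.pool(x, batch))  # k = 1..L
        h_g = torch.cat(readouts, dim=-1)         # h_G = ||_{k=0}^{L} Readout(.)
        return self.lin(h_g)
```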