SPAGAN: Shortest Path Graph Attention Network

Authors: Yiding Yang, Xinchao Wang, Mingli Song, Junsong Yuan, Dacheng Tao

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test SPAGAN on the downstream classification task on several standard datasets, and achieve performances superior to the state of the art."
Researcher Affiliation | Academia | "1Department of Computer Science, Stevens Institute of Technology 2College of Computer Science and Technology, Zhejiang University 3Department of Computer Science and Engineering, State University of New York at Buffalo 4UBTECH Sydney Artificial Intelligence Centre, University of Sydney {yyang99, xwang135}@stevens.edu, brooksong@zju.edu.cn, jsyuan@buffalo.edu, dacheng.tao@sydney.edu.au"
Pseudocode | No | The paper describes the proposed method step-by-step in prose, but it does not include a structured pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the proposed method.
Open Datasets | Yes | "We use three widely used semi-supervised graph datasets, Cora, Citeseer and Pubmed summarized in Tab. 1."
Dataset Splits | Yes | "Following the work of [Kipf and Welling, 2016; Veličković et al., 2018], for each dataset, we only use 20 nodes per class for training, 500 nodes for validating and 1000 nodes for testing."
Hardware Specification | Yes | "The running time of one epoch with path attention on Pubmed dataset is 0.1s on a Nvidia 1080Ti GPU."
Software Dependencies | No | "We implement SPAGAN under Pytorch framework [Paszke et al., 2017] and train it with Adam optimizer." Specific frameworks and optimizers are named, but no version numbers for PyTorch or other libraries are provided.
Experiment Setup | Yes | "For the Cora dataset, we set the learning rate to 0.005 and the weight of L2 regularization to 0.0005; for the Pubmed dataset, we set the learning rate to 0.01 and the weight of L2 regularization to 0.001. For the Citeseer dataset... we set the learning rate to 0.0085 and the weight of L2 regularization to 0.002. For all datasets, we set a tolerance window and stop the training process if there is no lower validation loss within it. We use two graph convolutional layers for all datasets with different attention heads. For the first layer, 8 attention heads for each c is used. Each attention head will compute 8 features. Then, an ELU [Clevert et al., 2015] function is applied. In the second layer, we use 8 attention heads for the Pubmed dataset and 1 attention head for the other two datasets. Dropout is applied to the input of each layer and also to the attention coefficients for each node, with a keep probability of 0.4. For all datasets, we set the r to 1.0... The max value of c is set to be three for the first layer and two for the last layer for all datasets. The steps of iteration is set to two. For all the datasets, we use early stopping based on the cross-entropy loss on validation set."
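The split protocol quoted under Dataset Splits (20 labelled nodes per class for training, 500 for validation, 1000 for testing) can be sketched in plain Python. This is an illustrative reconstruction, not the authors' code; the function name and the "first-come" ordering of the validation/test pools are assumptions.

```python
from collections import defaultdict

def planetoid_style_split(labels, per_class=20, n_val=500, n_test=1000):
    """Build index lists following the protocol quoted above:
    `per_class` labelled nodes per class for training, then the next
    `n_val` nodes for validation and `n_test` nodes for testing.
    (Ordering of the val/test pools is an assumption for illustration.)"""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Take the first `per_class` indices of every class for training.
    train = [i for cls in sorted(by_class) for i in by_class[cls][:per_class]]
    taken = set(train)
    rest = [i for i in range(len(labels)) if i not in taken]
    return train, rest[:n_val], rest[n_val:n_val + n_test]
```

With 7 classes (as in Cora) this yields 140 training nodes, matching the 20-per-class rule, and the three index sets are pairwise disjoint by construction.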
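The Experiment Setup row quotes per-dataset hyperparameters and a "tolerance window" early-stopping rule on validation loss. A minimal sketch of both, assuming a patience-style interpretation of the tolerance window (the paper does not state its size, and the class/parameter names here are hypothetical):

```python
# Per-dataset values as quoted above; everything else is an assumption.
HPARAMS = {
    "cora":     {"lr": 0.005,  "weight_decay": 0.0005},
    "citeseer": {"lr": 0.0085, "weight_decay": 0.002},
    "pubmed":   {"lr": 0.01,   "weight_decay": 0.001},
}

class EarlyStopping:
    """Tolerance-window early stopping on validation loss: stop once no
    new minimum has been observed for `patience` consecutive epochs.
    (`patience` is a hypothetical knob; the paper only says the training
    stops 'if there is no lower validation loss within' the window.)"""

    def __init__(self, patience=100):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss   # new minimum: reset the window
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1   # no improvement this epoch
        return self.bad_epochs >= self.patience  # True -> stop training
```

In a training loop one would call `stopper.step(val_loss)` once per epoch and break out when it returns `True`; the quoted setup monitors cross-entropy loss on the validation set.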