Adaptive Structural Fingerprints for Graph Attention Networks
Authors: Kai Zhang, Yaokang Zhu, Jun Wang, Jie Zhang
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate the power of our approach in exploiting rich structural information in GAT and in alleviating the intrinsic oversmoothing problem in graph neural networks. In this section, we report experimental results of the proposed method and state-of-the-art algorithms using graph-based benchmark data sets and a transductive classification problem. |
| Researcher Affiliation | Academia | Kai Zhang, Department of Computer & Information Sciences, Temple University, Philadelphia, PA 19122, USA, kzhang980@gmail.com; Yaokang Zhu & Jun Wang, School of Computer Science and Technology, East China Normal University, Shanghai, China, 52184501026@stu.ecnu.edu.cn, jwang@sei.ecnu.edu.cn; Jie Zhang, Institute of Brain-Inspired Intelligence, Fudan University, Shanghai, China, jzhang080@gmail.com |
| Pseudocode | No | The paper describes the algorithm steps (Step 1, Step 2, Step 3, Step 4) and provides a workflow diagram (Figure 4) but does not include a formally labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Our codes can be downloaded from the anonymous Github link http://github.com/AvigdorZ. |
| Open Datasets | Yes | We have selected three benchmark graph-structured data sets from (Sen et al., 2008), namely Cora, Citeseer, and Pubmed. The three data sets are all citation networks. We split the data set into three parts: training, validation, and testing, as shown in table 1. |
| Dataset Splits | Yes | We split the data set into three parts: training, validation, and testing, as shown in table 1. Algorithm performance will be evaluated on the classification precision on the test split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | Adam SGD is used for optimization, with learning rate λ = 5 × 10⁻⁴. The paper does not mention specific software libraries or their version numbers, only the optimization algorithm. |
| Experiment Setup | Yes | Altogether two layers of message passing are adopted. In the first layer, one transformation matrix W ∈ ℝ^{d×8} is learned for each of altogether 8 attention heads; in the second layer, a transformation matrix W ∈ ℝ^{64×C} is used on the concatenated features (from the 8 attention heads of the first layer), and one attention head is adopted followed by a softmax operator, where C is the number of classes. The number of parameters is 64(d + C). For the Pubmed data set, 8 attention heads are used in the second layer due to the larger graph size. Adam SGD is used for optimization, with learning rate λ = 5 × 10⁻⁴. Both the fingerprint size and the attention range are chosen as 2-hop neighbors in our approach. The restart probability is simply chosen as c = 0.5. (A hedged configuration sketch follows the table.) |
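
For readers trying to reproduce the quoted experiment setup, the following is a minimal sketch of the two-layer, 8-head attention architecture and the Adam optimizer it describes. It uses PyTorch Geometric's generic `GATConv` as a stand-in for the paper's adaptive-structural-fingerprint attention (which it does not implement); the Cora input dimension, the absence of dropout, and the weight-decay value are assumptions, not statements from the paper.

```python
# Sketch of the quoted setup, assuming standard GAT layers in place of the
# paper's fingerprint-augmented attention.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv


class TwoLayerGAT(torch.nn.Module):
    def __init__(self, in_dim: int, num_classes: int, heads: int = 8):
        super().__init__()
        # First layer: one d x 8 transform per head; the 8 head outputs are
        # concatenated, giving 8 * 8 = 64 hidden dimensions.
        self.conv1 = GATConv(in_dim, 8, heads=heads, concat=True)
        # Second layer: a single 64 x C transform with one attention head,
        # followed by a (log-)softmax over the C classes.
        self.conv2 = GATConv(heads * 8, num_classes, heads=1, concat=False)

    def forward(self, x, edge_index):
        x = F.elu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=-1)


# Adam with learning rate 5e-4 as quoted; the weight decay is an assumption.
model = TwoLayerGAT(in_dim=1433, num_classes=7)  # Cora-sized dimensions (assumed)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=5e-4)
```
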
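The quoted restart probability c = 0.5 belongs to the random walk with restart (RWR) used to build each node's structural fingerprint. Below is a hedged sketch, assuming the fingerprint is the RWR visiting distribution truncated to the node's 2-hop neighborhood (matching the stated fingerprint size); the exact weighting and normalization used in the paper may differ.

```python
import numpy as np


def rwr_fingerprint(adj: np.ndarray, seed: int, c: float = 0.5, n_iter: int = 50):
    """Random walk with restart from `seed` on a dense adjacency matrix `adj`.

    c is the restart probability (c = 0.5 in the quoted setup). The 2-hop
    truncation mirrors the stated fingerprint size; it is an assumption here.
    """
    n = adj.shape[0]
    # Column-normalized transition matrix.
    deg = adj.sum(axis=0, keepdims=True)
    P = adj / np.maximum(deg, 1e-12)
    e = np.zeros(n)
    e[seed] = 1.0
    r = e.copy()
    for _ in range(n_iter):
        # r <- c * restart + (1 - c) * one random-walk step
        r = c * e + (1 - c) * P @ r
    # Keep only nodes within 2 hops of the seed (plus the seed itself).
    two_hop = ((adj + adj @ adj)[seed] > 0) | (np.arange(n) == seed)
    r[~two_hop] = 0.0
    return r
```
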