Neural Structured Prediction for Inductive Node Classification
Authors: Meng Qu, Huiyu Cai, Jian Tang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on two settings show that our approach outperforms many competitive baselines. Experiments on two settings against both GNNs and CRFs prove the effectiveness of our approach. |
| Researcher Affiliation | Academia | Meng Qu¹,², Huiyu Cai¹,², Jian Tang¹,³,⁴; ¹Mila - Québec AI Institute; ²Université de Montréal; ³HEC Montréal; ⁴Canadian Institute for Advanced Research (CIFAR) |
| Pseudocode | No | The paper does not contain a pseudocode block or algorithm block. |
| Open Source Code | Yes | Codes are available at https://github.com/DeepGraphLearning/SPN. |
| Open Datasets | Yes | We use the PPI dataset (Zitnik & Leskovec, 2017; Hamilton et al., 2017), which has 20 training graphs. Besides, we also build a DBLP dataset from the citation network in Tang et al. (2008). ... We construct three datasets from the Cora, Citeseer, and Pubmed datasets used for transductive node classification (Yang et al., 2016). ... The statistics of the datasets used in our experiment are summarized in Tab. 8. The Cora*, Citeseer*, Pubmed*, and PPI datasets are under the MIT license. |
| Dataset Splits | Yes | The training/validation/test graphs are formed as the citation graphs of papers published before 1999, from 2000 to 2009, and after 2010, respectively. For each training/validation/test node of the raw dataset, we treat its ego network as a training/validation/test graph. An illustrative ego-network extraction is sketched after the table. |
| Hardware Specification | Yes | We run the experiment by using NVIDIA Tesla V100 GPUs with 16GB memory. |
| Software Dependencies | Yes | To facilitate reproducibility, we use the GNN module implementations of PyTorch Geometric (Fey & Lenssen, 2019), and follow the GNN models provided in the examples of the repository, unless otherwise mentioned. Note that most architecture choices are not optimal on the benchmark datasets, but we did not tune them since we only aim to show that our method brings consistent and significant improvement. A minimal model in this style is sketched after the table. |
| Experiment Setup | Yes | For GNNs, by default we use the same architectures (e.g., number of neurons, number of layers) as used in the original papers. Adam (Kingma & Ba, 2015) is used for training. For the edge GNN in Eq. (8), we add a hyperparameter γ to control the annealing temperature of the logit g(vs, vt) before the softmax function during belief propagation. Empirically, we find that max-product belief propagation works better than the sum-product variant in most cases, so we use the max-product version by default. By default, we do not run refinement when training SPNs. See Sec. F for details. For node classification, the learning rate of the node GNN τs in GNNs and SPNs is presented in Tab. 9. For edge classification, the learning rate of the edge GNN τst is presented in Tab. 10. For the temperature γ used in the edge GNN τst of SPNs, we report its values in Tab. 11. The temperature scaling and the max-product message update are sketched after the table. |
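
The Dataset Splits row describes turning each labeled node of Cora, Citeseer, and Pubmed into its own small graph. The sketch below shows one way to extract such an ego network with PyTorch Geometric; the helper name `build_ego_graph` and the default hop count are illustrative assumptions, not the exact construction from the paper's footnote.

```python
from torch_geometric.data import Data
from torch_geometric.utils import k_hop_subgraph

def build_ego_graph(data: Data, center: int, num_hops: int = 2) -> Data:
    """Extract the `num_hops`-hop ego network around `center` as its own graph."""
    subset, edge_index, mapping, _ = k_hop_subgraph(
        center, num_hops, data.edge_index,
        relabel_nodes=True, num_nodes=data.num_nodes,
    )
    ego = Data(x=data.x[subset], edge_index=edge_index, y=data.y[subset])
    ego.center = mapping  # position of the center node inside the ego graph
    return ego
```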
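
The Software Dependencies row says the GNN modules follow the examples of the PyTorch Geometric repository. Below is a minimal two-layer GCN in that style; the hidden size and dropout rate are placeholder assumptions, not the tuned settings reported in Tabs. 9-11.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    """A two-layer GCN in the style of the PyTorch Geometric examples."""

    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)  # per-node class logits
```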
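
The Experiment Setup row mentions a temperature γ applied to the edge logit g(vs, vt) before the softmax, and a preference for max-product over sum-product belief propagation. The sketch below shows a standard temperature-scaled softmax and a log-space max-product message update; the function names and tensor shapes are assumptions for illustration, not the authors' implementation of Eq. (8).

```python
import torch
import torch.nn.functional as F

def edge_potentials(edge_logits: torch.Tensor, gamma: float) -> torch.Tensor:
    """Temperature-scaled softmax over pairwise label scores.

    edge_logits: [num_edges, C, C] logits g(v_s, v_t); a smaller gamma
    yields sharper potentials for belief propagation.
    """
    num_edges = edge_logits.size(0)
    probs = F.softmax(edge_logits.view(num_edges, -1) / gamma, dim=-1)
    return probs.view_as(edge_logits)

def max_product_message(node_belief: torch.Tensor,
                        log_potential: torch.Tensor) -> torch.Tensor:
    """One max-product message in log space along a single edge.

    node_belief: [C] log-beliefs at the sending node.
    log_potential: [C, C] log pairwise potential on the edge.
    The sum-product variant would replace the max with torch.logsumexp.
    """
    return (node_belief.unsqueeze(1) + log_potential).max(dim=0).values
```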