Boosting Graph Structure Learning with Dummy Nodes
Authors: Xin Liu, Jiayang Cheng, Yangqiu Song, Xin Jiang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on graph classification and subgraph isomorphism counting and matching, and empirical results reveal the success of learning with graphs with dummy nodes. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, China; (2) Huawei Noah's Ark Lab, Hong Kong SAR, China. |
| Pseudocode | Yes | Algorithm 1: Edge-to-vertex transform LΦ (a generic sketch of an edge-to-vertex transform follows the table). |
| Open Source Code | Yes | Code is publicly released at https://github.com/HKUST-KnowComp/DummyNode4GraphLearning. |
| Open Datasets | Yes | Datasets. We select four benchmarking datasets where current state-of-the-art models face overfitting problems: PROTEINS (Borgwardt et al., 2005), D&D (Dobson & Doig, 2003), NCI109, and NCI1 (Wale et al., 2008). |
| Dataset Splits | Yes | Following previous work (Zhang et al., 2019), we randomly split each dataset into the training set (80%), the validation set (10%), and the test set (10%) in each run (a split sketch follows the table). |
| Hardware Specification | Yes | We conduct our experiments on one CentOS 7 server with 2 Intel Xeon Gold 5215 CPUs and 4 NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | Yes | The software versions are: GNU C++ Compiler 5.2.0, Python 3.7.3, PyTorch 1.7.1, torch-geometric 2.0.2, and DGL 0.6.0. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2015) to optimize the models. Following Zhang et al. (2019), an early stopping strategy with patience 100 is adopted during training, i.e., training is stopped when the loss on the validation set does not decrease for over 100 epochs. For GraphSAGE, GIN and DiffPool, the optimal hyper-parameters are found using grid search within the same search ranges as in (Errica et al., 2020). For HGP-SL, we follow the official hyper-parameters reported in their GitHub repository. For models using the graph convolutional operator (Kipf & Welling, 2017) (GCN, DiffPool, HGP-SL), we additionally impose a learnable weight γ on the dummy edges. The weights for all other edges are set to 1, and γ is initialized with different values in {0.01, 0.1, 1, 10}. For relational models RGCN and RGIN, we search for the learning rate within {1e-2, 1e-3, 1e-4}, batch size {128, 512}, hidden dimension {32, 64}, dropout ratio {0, 0.5}, and number of layers {2, 4}. (Sketches of the early-stopping criterion and the learnable dummy-edge weight follow this table.) |
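
The pseudocode row refers to the paper's Algorithm 1, an edge-to-vertex transform LΦ. The paper's exact algorithm (which accounts for dummy nodes) is not reproduced here; the snippet below is only a generic sketch of a plain edge-to-vertex (line-graph) transform, in which every original edge becomes a vertex and two new vertices are adjacent whenever the corresponding edges share an endpoint.

```python
from itertools import combinations
from collections import defaultdict

def edge_to_vertex_transform(edges):
    """Generic edge-to-vertex (line-graph) transform; a sketch only,
    not the paper's Algorithm 1."""
    # Group edge indices by the endpoints they touch.
    incident = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        incident[u].append(idx)
        incident[v].append(idx)
    # Connect every pair of edges that meet at a common endpoint.
    line_edges = set()
    for _, edge_ids in incident.items():
        for a, b in combinations(edge_ids, 2):
            line_edges.add((min(a, b), max(a, b)))
    return sorted(line_edges)

# Example: a triangle 0-1, 1-2, 2-0 maps to a triangle over its three edges.
print(edge_to_vertex_transform([(0, 1), (1, 2), (2, 0)]))
```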
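
The split and early-stopping rows describe the training protocol: random 80%/10%/10% splits per run and early stopping with patience 100 on the validation loss. The helpers below are illustrative sketches of that protocol; the function names and the fixed seed are assumptions, not taken from the released code.

```python
import random

def split_indices(num_graphs, seed=0):
    # Random 80% / 10% / 10% split as described in the table.
    # The helper name and the fixed seed are illustrative assumptions.
    idx = list(range(num_graphs))
    random.Random(seed).shuffle(idx)
    n_train, n_val = int(0.8 * num_graphs), int(0.1 * num_graphs)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def should_stop(val_losses, patience=100):
    # Early stopping as described: stop once the validation loss has not
    # decreased for more than `patience` epochs.
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch > patience
```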
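
The experiment-setup row also mentions a learnable weight γ imposed on dummy edges for GCN-based models, with all other edges weighted 1. The module below is a minimal sketch of that idea using torch-geometric's GCNConv, which accepts per-edge weights; the class name, the zero feature for the dummy node, and the single-layer setup are illustrative assumptions, not the authors' implementation.

```python
import torch
from torch import nn
from torch_geometric.nn import GCNConv

class DummyNodeGCN(nn.Module):
    # Sketch: append one dummy node connected to every original node and
    # weight those dummy edges with a learnable scalar gamma; original
    # edges keep weight 1, matching the setup quoted above.
    def __init__(self, in_dim, hidden_dim, gamma_init=0.1):
        super().__init__()
        self.conv = GCNConv(in_dim, hidden_dim)
        self.gamma = nn.Parameter(torch.tensor(gamma_init))  # learnable dummy-edge weight

    def forward(self, x, edge_index):
        n, num_orig_edges = x.size(0), edge_index.size(1)
        dummy_id = n
        # Placeholder zero feature for the dummy node (an illustrative choice).
        x = torch.cat([x, x.new_zeros(1, x.size(1))], dim=0)
        # Bidirectional edges between the dummy node and every original node.
        originals = torch.arange(n, device=edge_index.device)
        dummies = torch.full((n,), dummy_id, dtype=torch.long, device=edge_index.device)
        dummy_edges = torch.cat(
            [torch.stack([originals, dummies]), torch.stack([dummies, originals])], dim=1
        )
        edge_index = torch.cat([edge_index, dummy_edges], dim=1)
        # Weight 1 on original edges, learnable gamma on the 2n dummy edges.
        edge_weight = torch.cat([x.new_ones(num_orig_edges), self.gamma * x.new_ones(2 * n)])
        return self.conv(x, edge_index, edge_weight)
```

Per the quoted setup, gamma_init would be chosen from {0.01, 0.1, 1, 10}; the value 0.1 above is just a default for the sketch.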