A Practical, Progressively-Expressive GNN
Authors: Lingxiao Zhao, Louis Härtel, Neil Shah, Leman Akoglu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, our (k, c)(≤)-SETGNN outperforms existing state-of-the-art GNNs on simulated expressiveness as well as real-world graph-level tasks, achieving new bests on substructure counting and ZINC-12K, respectively. We show that generalization performance reflects increasing expressiveness by k and c. |
| Researcher Affiliation | Collaboration | Lingxiao Zhao (Carnegie Mellon University), Louis Härtel (RWTH Aachen University), Neil Shah (Snap Inc., nshah@snap.com), Leman Akoglu (Carnegie Mellon University) |
| Pseudocode | Yes | Fortunately, S(k,c)-swl(G) can be efficiently and recursively constructed based on S(k−1,c)-swl(G), and we include the construction algorithm in Appendix A.5. (A rough illustrative sketch of this level-by-level construction appears after the table.) |
| Open Source Code | Yes | We open source our implementation at https://github.com/LingxiaoShawn/KCSetGNN. |
| Open Datasets | Yes | We also evaluate performance on two real world graph learning tasks: 5) ZINC-12K [19], and 6) QM9 [59] for molecular property regression. |
| Dataset Splits | Yes | For ZINC-12K, we follow the common practice [19] and use 10,000 graphs for training, 1,000 graphs for validation, and 1,000 graphs for testing. For QM9, we use 110,000 graphs for training, 10,000 graphs for validation, and 10,000 graphs for testing. For the simulation datasets, we randomly split 80% of data for training and 20% for testing. |
| Hardware Specification | Yes | All experiments are conducted on 8 Nvidia A100 GPUs with 40 GB memory each. |
| Software Dependencies | No | Our code is based on PyTorch Geometric [20]. The paper mentions the library but does not specify a version number for PyTorch Geometric or any other software components. |
| Experiment Setup | Yes | Hyperparameter and model configurations are described in Appendix A.8, which states: We use a 4-layer (k,c)-SETGNN with 128 hidden dimensions for all experiments. We use the Adam optimizer with a learning rate of 0.001, a batch size of 32 for ZINC-12K and 128 for QM9, and train for 300 epochs. Dropout is applied after each layer with a rate of 0.5. For the base GNN, we use a 4-layer GIN with 128 hidden dimensions. For ZINC-12K, we follow the common practice [19] and use 10,000 graphs for training, 1,000 graphs for validation, and 1,000 graphs for testing. For QM9, we use 110,000 graphs for training, 10,000 graphs for validation, and 10,000 graphs for testing. For the simulation datasets, we randomly split 80% of the data for training and 20% for testing. We train for 100 epochs, with batch size 64. (A configuration sketch based on these values follows the table.) |
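
The Pseudocode row above refers to the recursive, level-by-level construction of S(k,c)-swl(G) from S(k−1,c)-swl(G) described in Appendix A.5 of the paper. The following is a rough illustrative sketch of that idea, not the authors' algorithm: it enumerates node sets of size at most k whose induced subgraph has at most c connected components by extending each (k−1)-set with one extra node. The helper name `k_c_sets` and the use of `networkx` are assumptions made for this example.

```python
# Illustrative sketch only: enumerate all node sets of size <= k whose induced
# subgraph has at most c connected components, building level "size" from
# level "size - 1" by adding one node (mirroring the recursive construction).
import networkx as nx


def k_c_sets(G: nx.Graph, k: int, c: int):
    """Return {set_size: list of frozensets} of (<= k, <= c) node sets of G."""
    levels = {1: [frozenset({v}) for v in G.nodes]}  # singletons: one component each
    for size in range(2, k + 1):
        seen = set()
        for s in levels[size - 1]:
            for v in G.nodes:  # extend each (size-1)-set by a single extra node
                if v in s:
                    continue
                t = s | {v}
                if t in seen:
                    continue
                if nx.number_connected_components(G.subgraph(t)) <= c:
                    seen.add(t)
        levels[size] = list(seen)
    return levels


if __name__ == "__main__":
    G = nx.cycle_graph(6)
    counts = {size: len(sets) for size, sets in k_c_sets(G, k=3, c=1).items()}
    print(counts)  # counts of connected node sets of size 1, 2, 3 in a 6-cycle
```

Note that this toy sketch only enumerates the sets themselves; the paper's Appendix A.5 routine additionally builds the super-graph structure over these sets that (k,c)-SETGNN uses for message passing.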
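
The Dataset Splits and Experiment Setup rows translate into a short training configuration. Below is a minimal sketch under stated assumptions, not the released implementation (see https://github.com/LingxiaoShawn/KCSetGNN for that): PyTorch Geometric's ZINC dataset with `subset=True` provides the 10,000/1,000/1,000 ZINC-12K split, and since the (k,c)-SETGNN model is not reimplemented here, the paper's base GNN (a 4-layer GIN with 128 hidden dimensions) stands in; the atom-type embedding size is also an assumption.

```python
# Sketch of the reported ZINC-12K setup: Adam with lr 0.001, batch size 32,
# 300 epochs, dropout 0.5. The model below is a stand-in (the paper's 4-layer
# GIN base GNN), not the (k,c)-SETGNN itself.
import torch
from torch.nn import Dropout, Embedding, Linear, ModuleList, ReLU, Sequential
from torch_geometric.datasets import ZINC
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GINConv, global_add_pool

# ZINC with subset=True is ZINC-12K: 10,000 / 1,000 / 1,000 train/val/test graphs.
train_set = ZINC("data/ZINC", subset=True, split="train")
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)  # batch size 32 for ZINC-12K


class BaseGIN(torch.nn.Module):
    """Stand-in base GNN: 4-layer GIN, 128 hidden dims, dropout 0.5."""

    def __init__(self, hidden=128, num_layers=4, dropout=0.5, num_atom_types=28):
        super().__init__()
        self.emb = Embedding(num_atom_types, hidden)  # embedding size is an assumption
        self.convs = ModuleList(
            [GINConv(Sequential(Linear(hidden, hidden), ReLU(), Linear(hidden, hidden)))
             for _ in range(num_layers)]
        )
        self.dropout = Dropout(dropout)
        self.out = Linear(hidden, 1)

    def forward(self, data):
        x = self.emb(data.x.squeeze(-1))  # ZINC node features are integer atom types
        for conv in self.convs:
            x = self.dropout(conv(x, data.edge_index).relu())
        return self.out(global_add_pool(x, data.batch)).squeeze(-1)


model = BaseGIN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, learning rate 0.001

for epoch in range(300):  # 300 training epochs as reported
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        loss = (model(batch) - batch.y).abs().mean()  # MAE, the standard ZINC regression loss
        loss.backward()
        optimizer.step()
```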