A Practical, Progressively-Expressive GNN

Authors: Lingxiao Zhao, Louis Härtel, Neil Shah, Leman Akoglu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, our (k,c)(≤)-SETGNN outperforms existing state-of-the-art GNNs on simulated expressiveness as well as real-world graph-level tasks, achieving new bests on substructure counting and ZINC-12K, respectively. We show that generalization performance reflects increasing expressiveness by k and c.
Researcher Affiliation | Collaboration | Lingxiao Zhao, Carnegie Mellon University; Louis Härtel, RWTH Aachen University; Neil Shah, Snap Inc. (nshah@snap.com); Leman Akoglu, Carnegie Mellon University
Pseudocode | Yes | Fortunately, S(k,c)-swl(G) can be efficiently and recursively constructed based on S(k-1,c)-swl(G), and we include the construction algorithm in Appendix A.5. (A naive construction sketch is given after this table.)
Open Source Code | Yes | We open source our implementation at https://github.com/LingxiaoShawn/KCSetGNN.
Open Datasets | Yes | We also evaluate performance on two real world graph learning tasks: 5) ZINC-12K [19], and 6) QM9 [59] for molecular property regression.
Dataset Splits | Yes | For ZINC-12K, we follow the common practice [19] and use 10,000 graphs for training, 1,000 graphs for validation, and 1,000 graphs for testing. For QM9, we use 110,000 graphs for training, 10,000 graphs for validation, and 10,000 graphs for testing. For the simulation datasets, we randomly split 80% of data for training and 20% for testing. (A data-loading sketch is given after this table.)
Hardware Specification | Yes | All experiments are conducted on 8 Nvidia A100 GPUs with 40 GB memory each.
Software Dependencies | No | Our code is based on PyTorch Geometric [20]. While the paper mentions the library, it does not specify a version number for PyTorch Geometric or any other software components.
Experiment Setup | Yes | Hyperparameter and model configurations are described in Appendix A.8. Appendix A.8: We use a 4-layer (k,c)-SETGNN with 128 hidden dimensions for all experiments. We use the Adam optimizer with a learning rate of 0.001, a batch size of 32 for ZINC-12K and 128 for QM9, and train for 300 epochs. Dropout is applied after each layer with a rate of 0.5. For the Base GNN, we use a 4-layer GIN with 128 hidden dimensions. For ZINC-12K, we follow the common practice [19] and use 10,000 graphs for training, 1,000 graphs for validation, and 1,000 graphs for testing. For QM9, we use 110,000 graphs for training, 10,000 graphs for validation, and 10,000 graphs for testing. For the simulation datasets, we randomly split 80% of data for training and 20% for testing. We train for 100 epochs with batch size 64. (A training-loop sketch is given after this table.)
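
To make the recursive construction quoted under "Pseudocode" concrete, the following is a minimal Python sketch of the underlying idea: enumerate node sets of size at most k whose induced subgraph has at most c connected components by growing the size-(k-1) sets one node at a time. This is a naive illustration, not the optimized algorithm from the paper's Appendix A.5; the function name and the use of NetworkX are our own choices.

    import networkx as nx

    def k_c_sets(G, k, c):
        """Enumerate node sets S with |S| <= k whose induced subgraph G[S]
        has at most c connected components, built level by level from the
        (k-1)-sized sets (a naive sketch of the recursive construction)."""
        # Size-1 sets: every single node induces exactly one component.
        frontier = {frozenset([v]): 1 for v in G.nodes}
        all_sets = dict(frontier)
        for _ in range(2, k + 1):
            next_frontier = {}
            for s in frontier:                  # extend each valid smaller set
                for v in G.nodes:
                    if v in s:
                        continue
                    t = s | {v}
                    if t in next_frontier:
                        continue
                    comps = nx.number_connected_components(G.subgraph(t))
                    if comps <= c:              # keep only sets with <= c components
                        next_frontier[t] = comps
            all_sets.update(next_frontier)
            frontier = next_frontier
        return all_sets

    # Tiny usage example: all (k=3, c=2) sets of a 5-cycle.
    G = nx.cycle_graph(5)
    print(len(k_c_sets(G, k=3, c=2)))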
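The ZINC-12K split quoted under "Dataset Splits" matches the standard "subset" version of ZINC shipped with PyTorch Geometric. Below is a brief sketch of how such splits could be loaded; the root path and the random 80/20 helper for the simulation datasets are illustrative choices of ours, not taken from the released code.

    import torch
    from torch_geometric.datasets import ZINC

    # The ZINC "subset" is the common 12K benchmark: 10,000 / 1,000 / 1,000 graphs.
    train_set = ZINC('data/ZINC', subset=True, split='train')
    val_set = ZINC('data/ZINC', subset=True, split='val')
    test_set = ZINC('data/ZINC', subset=True, split='test')

    # The simulation datasets use a random 80/20 train/test split; one simple
    # way to produce such a split for any indexable dataset:
    def random_split_indices(n, train_frac=0.8, seed=0):
        perm = torch.randperm(n, generator=torch.Generator().manual_seed(seed))
        cut = int(train_frac * n)
        return perm[:cut], perm[cut:]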
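Finally, the hyperparameters quoted under "Experiment Setup" (Adam, learning rate 0.001, batch size 32 for ZINC-12K, 300 epochs, dropout 0.5) map onto a standard PyTorch Geometric training loop. The sketch below shows that mapping with a stand-in model built from the quoted Base GNN (a 4-layer GIN with 128 hidden dimensions) plus a sum readout; it is not the released (k,c)-SETGNN implementation, and the GraphRegressor wrapper is our own.

    import torch
    from torch_geometric.datasets import ZINC
    from torch_geometric.loader import DataLoader
    from torch_geometric.nn import global_add_pool
    from torch_geometric.nn.models import GIN

    class GraphRegressor(torch.nn.Module):
        """Stand-in graph-level regressor: 4-layer GIN (128 dims, dropout 0.5)
        with a sum readout; mirrors only the quoted hyperparameters."""
        def __init__(self):
            super().__init__()
            self.gnn = GIN(in_channels=1, hidden_channels=128, num_layers=4, dropout=0.5)
            self.head = torch.nn.Linear(128, 1)

        def forward(self, data):
            x = self.gnn(data.x.float(), data.edge_index)  # node embeddings
            x = global_add_pool(x, data.batch)             # graph-level readout
            return self.head(x).squeeze(-1)

    train_set = ZINC('data/ZINC', subset=True, split='train')
    loader = DataLoader(train_set, batch_size=32, shuffle=True)  # 32 for ZINC-12K, 128 for QM9
    model = GraphRegressor()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)    # lr 0.001 as quoted

    model.train()
    for epoch in range(300):                                     # 300 epochs for ZINC-12K / QM9
        for batch in loader:
            optimizer.zero_grad()
            loss = (model(batch) - batch.y).abs().mean()         # MAE regression loss
            loss.backward()
            optimizer.step()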