Graph Classification via Reference Distribution Learning: Theory and Practice
Authors: Zixiao Wang, Jicong Fan
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on moderate-scale and large-scale graph datasets show the superiority of GRDL over the state-of-the-art, emphasizing its remarkable efficiency, being at least 10 times faster than leading competitors in both training and inference stages. |
| Researcher Affiliation | Academia | Zixiao Wang, Jicong Fan; School of Data Science, The Chinese University of Hong Kong, Shenzhen; zixiaowang@link.cuhk.edu.cn, fanjicong@cuhk.edu.cn |
| Pseudocode | Yes | "C Detailed Training Algorithm of GRDL" and "Algorithm 1 GRDL Training" |
| Open Source Code | Yes | The source code of GRDL is available at https://github.com/jicongfan/GRDL-Graph-Classification. |
| Open Datasets | Yes | We leverage eight popular graph classification benchmarks [Morris et al., 2020], comprising five bioinformatics datasets (MUTAG, PROTEINS, NCI1, PTC-MR, BZR) and three social network datasets (IMDB-B, IMDB-M, COLLAB). We also use three large-scale imbalanced datasets (PC-3, MCF-7, and ogbg-molhiv [Hu et al., 2020]). |
| Dataset Splits | Yes | We quantify the generalization capacities of models by performing a 10-fold cross-validation with a holdout test set which is never seen during training. The validation accuracy is tracked every 5 epochs, and the model that maximizes the validation accuracy is retained for testing. |
| Hardware Specification | Yes | "Experiments were conducted on CPUs (Apple M1) using identical batch sizes, ensuring a fair comparison." and "All experiments are conducted on one RTX3080." |
| Software Dependencies | No | The paper mentions the Adam optimizer but does not provide version numbers for any software dependencies, such as Python or the deep learning framework used. |
| Experiment Setup | Yes | "For our methods, we use GIN [Xu et al., 2018] layers as the embedding network. Every GIN layer is an MLP of 2 layers (r = 2) with batch normalization, whose number of units is validated in {32, 64} for all datasets. The parameter λ is validated in {0.1, 1}. We validate the number of GIN layers (L) in {3, 4, 5, 6, 7, 8, 9}." and "The models are trained with Adam optimizer with an initial learning rate α1 = 10^-3 for network weights and references. The learning rate α1 decays exponentially with a factor 0.95... The batch size for all datasets is fixed to 32." |
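
A minimal sketch of the reported setup, assuming PyTorch and PyTorch Geometric; the class name `GINEmbedding`, the hidden width (64), and the layer count (5) are illustrative picks from the validated ranges, not the authors' released code:

```python
# Hypothetical re-implementation sketch of the embedding network and optimizer
# described above: GIN layers whose update function is a 2-layer MLP with batch
# normalization, trained with Adam at lr = 1e-3 and exponential decay 0.95.
import torch
from torch import nn
from torch_geometric.nn import GINConv

class GINEmbedding(nn.Module):
    """Stack of L GIN layers; each wraps a 2-layer batch-normalized MLP."""
    def __init__(self, in_dim, hidden_dim=64, num_layers=5):
        super().__init__()
        self.layers = nn.ModuleList()
        dims = [in_dim] + [hidden_dim] * num_layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            mlp = nn.Sequential(
                nn.Linear(d_in, d_out),
                nn.BatchNorm1d(d_out),
                nn.ReLU(),
                nn.Linear(d_out, d_out),
                nn.BatchNorm1d(d_out),
                nn.ReLU(),
            )
            self.layers.append(GINConv(mlp))

    def forward(self, x, edge_index):
        # Returns node embeddings; GRDL then compares their distribution
        # against learned reference distributions for classification.
        for conv in self.layers:
            x = conv(x, edge_index)
        return x

# in_dim is the dataset's node-feature dimension (dataset-dependent).
model = GINEmbedding(in_dim=7, hidden_dim=64, num_layers=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
```

In an actual run, graphs would be fed through a `torch_geometric.loader.DataLoader` with batch size 32, as stated in the quoted setup, and the reference distributions would be registered as additional trainable parameters in the same optimizer.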