Graph Classification via Reference Distribution Learning: Theory and Practice

Authors: Zixiao Wang, Jicong Fan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on moderate-scale and large-scale graph datasets show the superiority of GRDL over the state-of-the-art, emphasizing its remarkable efficiency, being at least 10 times faster than leading competitors in both training and inference stages.
Researcher Affiliation | Academia | Zixiao Wang and Jicong Fan, School of Data Science, The Chinese University of Hong Kong, Shenzhen (zixiaowang@link.cuhk.edu.cn, fanjicong@cuhk.edu.cn)
Pseudocode | Yes | "C Detailed Training Algorithm of GRDL" and "Algorithm 1 GRDL Training"
Open Source Code | Yes | The source code of GRDL is available at https://github.com/jicongfan/GRDL-Graph-Classification.
Open Datasets | Yes | We leverage eight popular graph classification benchmarks [Morris et al., 2020], comprising five bioinformatics datasets (MUTAG, PROTEINS, NCI1, PTC-MR, BZR) and three social network datasets (IMDB-B, IMDB-M, COLLAB). We also use three large-scale imbalanced datasets (PC-3, MCF-7, and ogbg-molhiv [Hu et al., 2020]). A loading sketch follows the table.
Dataset Splits | Yes | We quantify the generalization capacities of models by performing a 10-fold cross-validation with a holdout test set which is never seen during training. The validation accuracy is tracked every 5 epochs, and the model that maximizes the validation accuracy is retained for testing. A protocol sketch follows the table.
Hardware Specification | Yes | "Experiments were conducted on CPUs (Apple M1) using identical batch sizes, ensuring a fair comparison." and "All experiments are conducted on one RTX3080."
Software Dependencies | No | The paper mentions the Adam optimizer but does not provide specific version numbers for any software dependencies such as Python or deep learning frameworks.
Experiment Setup | Yes | "For our methods, we use GIN [Xu et al., 2018] layers as the embedding network. Every GIN layer is an MLP of 2 layers (r = 2) with batch normalization, whose number of units is validated in {32, 64} for all datasets. The parameter λ is validated in {0.1, 1}. We validate the number of GIN layers (L) in {3, 4, 5, 6, 7, 8, 9}." and "The models are trained with Adam optimizer with an initial learning rate α1 = 10⁻³ for network weights and references. The learning rate α1 decays exponentially with a factor 0.95... The batch size for all datasets is fixed to 32." A configuration sketch follows the table.
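The Open Datasets row names standard TU benchmarks [Morris et al., 2020] and OGB's ogbg-molhiv [Hu et al., 2020]. Below is a minimal loading sketch, assuming PyTorch Geometric and the ogb package (the paper does not pin software versions); the dataset identifiers follow TUDataset repository naming and are assumptions that may differ from the paper's abbreviations (e.g. PTC-MR, IMDB-B, IMDB-M).

```python
from torch_geometric.datasets import TUDataset
from ogb.graphproppred import PygGraphPropPredDataset

# TUDataset identifiers; 'PTC_MR', 'IMDB-BINARY', 'IMDB-MULTI', 'PC-3', and 'MCF-7'
# are assumed repository names, not spellings taken from the paper.
TU_NAMES = ["MUTAG", "PROTEINS", "NCI1", "PTC_MR", "BZR",
            "IMDB-BINARY", "IMDB-MULTI", "COLLAB", "PC-3", "MCF-7"]

def load_benchmarks(root="data"):
    """Download the TU benchmarks and ogbg-molhiv into `root`, returned by name."""
    datasets = {name: TUDataset(root=root, name=name) for name in TU_NAMES}
    datasets["ogbg-molhiv"] = PygGraphPropPredDataset(name="ogbg-molhiv", root=root)
    return datasets
```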
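The Dataset Splits row describes 10-fold cross-validation with a never-seen holdout test set, validation every 5 epochs, and retention of the best-validation model. A sketch of that protocol under stated assumptions follows; the holdout fraction and the helper functions (build_model, train_one_epoch, evaluate) are placeholders, not details taken from the paper.

```python
import copy
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def cross_validate(labels, build_model, train_one_epoch, evaluate,
                   n_epochs=200, eval_every=5, seed=0):
    """10-fold CV with a holdout test set; keeps the model with the best validation accuracy."""
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    # Holdout test split (the 10% fraction is an assumption; the paper does not state it).
    trainval_idx, test_idx = train_test_split(
        idx, test_size=0.1, stratify=labels, random_state=seed)

    test_scores = []
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    for tr, va in skf.split(trainval_idx, labels[trainval_idx]):
        model = build_model()
        best_val, best_state = -np.inf, None
        for epoch in range(1, n_epochs + 1):
            train_one_epoch(model, trainval_idx[tr])
            if epoch % eval_every == 0:          # validation accuracy tracked every 5 epochs
                val_acc = evaluate(model, trainval_idx[va])
                if val_acc > best_val:
                    best_val, best_state = val_acc, copy.deepcopy(model.state_dict())
        model.load_state_dict(best_state)        # best-validation model is retained for testing
        test_scores.append(evaluate(model, test_idx))
    return float(np.mean(test_scores)), float(np.std(test_scores))
```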
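The Experiment Setup row specifies the encoder and optimizer hyper-parameters. The sketch below is a generic GIN embedding network and optimizer configuration matching those values, assuming PyTorch and PyTorch Geometric; it is not the authors' released GRDL implementation, and the GRDL-specific reference-distribution loss and its λ hyper-parameter are not reproduced here.

```python
import torch
from torch.nn import Sequential, Linear, BatchNorm1d, ReLU, ModuleList, Module
from torch_geometric.nn import GINConv

class GINEmbedding(Module):
    """L GIN layers, each wrapping a 2-layer MLP (r = 2) with batch normalization."""
    def __init__(self, in_dim, hidden_dim=64, num_layers=5):  # hidden_dim in {32, 64}, num_layers in {3..9}
        super().__init__()
        self.convs = ModuleList()
        dim = in_dim
        for _ in range(num_layers):
            mlp = Sequential(Linear(dim, hidden_dim), BatchNorm1d(hidden_dim),
                             ReLU(), Linear(hidden_dim, hidden_dim))
            self.convs.append(GINConv(mlp))
            dim = hidden_dim

    def forward(self, x, edge_index):
        for conv in self.convs:
            x = conv(x, edge_index)
        return x  # node embeddings; GRDL compares them, as a distribution, to learned references

model = GINEmbedding(in_dim=7, hidden_dim=64, num_layers=5)  # in_dim=7 is illustrative (MUTAG node labels)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)     # initial learning rate 10^-3
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)  # exponential decay, factor 0.95
batch_size = 32                                               # fixed for all datasets
```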