GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning
Authors: Guibin Zhang, Haonan Dong, Yuchen Zhang, Zhixun Li, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five datasets across three GNN backbones demonstrate that GDeR (I) achieves or surpasses the performance of the full dataset with 30%-50% fewer training samples, (II) attains up to a 2.81× lossless training speedup, and (III) outperforms state-of-the-art pruning methods in imbalanced training and noisy training scenarios by 0.3%-4.3% and 3.6%-7.8%, respectively. |
| Researcher Affiliation | Collaboration | Guibin Zhang 1,2, Haonan Dong 1, Yuchen Zhang 2, Zhixun Li 3, Dingshuo Chen 4, Kai Wang 5, Tianlong Chen 6, Yuxuan Liang 7, Dawei Cheng 1,2, Kun Wang 8; 1 Tongji University, 2 Shanghai AI Laboratory, 3 CUHK, 4 UCAS, 5 NUS, 6 UNC-Chapel Hill, 7 HKUST (Guangzhou), 8 NTU |
| Pseudocode | Yes | Algorithm 1: Algorithm workflow of GDeR |
| Open Source Code | Yes | The source code is available at https://github.com/ins1stenc3/GDeR. |
| Open Datasets | Yes | We test GDeR on two widely-used datasets, MUTAG [38] and DHFR [87]; two OGB large-scale datasets, OGBG-MOLHIV and OGBG-MOLPCBA [88]; and one large-scale chemical compound dataset, ZINC [89]. |
| Dataset Splits | Yes | Following [40], we adopt a 25%/25%/50% train/validation/test random split for the MUTAG and DHFR under imbalanced scenarios and 80%/10%/10% under normal and biased scenarios, both reporting results across 20 data splits. |
| Hardware Specification | Yes | All the experiments are conducted on NVIDIA Tesla V100 (32GB GPU), using the PyTorch and PyTorch Geometric frameworks. |
| Software Dependencies | No | The paper mentions the 'PyTorch and PyTorch Geometric framework' but does not specify their version numbers. |
| Experiment Setup | Yes | The hyperparameters in GDeR include the temperature coefficient τ, prototype count K, and loss-specific coefficients λ1 and λ2. Practically, we uniformly set K = 2, and tune the other three by grid searching: τ ∈ {1e-3, 1e-4, 1e-5}, λ1 ∈ {1e-1, 5e-1}, λ2 ∈ {1e-1, 1e-5}. |
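The reported hyperparameter search can be sketched as a simple grid enumeration. This is a minimal illustration of the search space described above, not the authors' code; the config-dictionary layout is an assumption made for clarity.

```python
# Hedged sketch of GDeR's hyperparameter grid: K is fixed to 2, while the
# temperature tau and the loss coefficients lambda1/lambda2 are grid-searched
# over the values reported in the paper.
from itertools import product

K = 2                                # prototype count, fixed per the paper
tau_grid = [1e-3, 1e-4, 1e-5]        # temperature coefficient tau
lambda1_grid = [1e-1, 5e-1]          # loss-specific coefficient lambda1
lambda2_grid = [1e-1, 1e-5]          # loss-specific coefficient lambda2

# Enumerate every candidate configuration (3 x 2 x 2 = 12 combinations).
configs = [
    {"K": K, "tau": tau, "lambda1": l1, "lambda2": l2}
    for tau, l1, l2 in product(tau_grid, lambda1_grid, lambda2_grid)
]
print(len(configs))  # 12 candidate configurations
```

Each configuration would then be trained and scored on the validation split, with the best-performing one reported; that selection loop is omitted here since the paper does not detail it.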