StructComp: Substituting propagation with Structural Compression in Training Graph Contrastive Learning
Authors: Shengzhong Zhang, Wenjie Yang, Xinyuan Cao, Hongwei Zhang, Zengfeng Huang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on various datasets show that StructComp greatly reduces time and memory consumption while improving model performance compared to vanilla GCL models and scalable training methods. |
| Researcher Affiliation | Academia | Shengzhong Zhang, Fudan University, Shanghai, China (szzhang17@fudan.edu.cn); Wenjie Yang, Fudan University, Shanghai, China (yangwj22@m.fudan.edu.cn); Xinyuan Cao, Georgia Institute of Technology, Atlanta, USA (xcao78@gatech.edu); Hongwei Zhang, Fudan University, Shanghai, China (hwzhang22@m.fudan.edu.cn); Zengfeng Huang, Fudan University, Shanghai, China (huangzf@fudan.edu.cn) |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. It describes methods in narrative text and mathematical formulations. |
| Open Source Code | Yes | The remaining hyperparameter settings for each GCL model are listed in our code: https://github.com/szzhang17/StructComp |
| Open Datasets | Yes | The results are evaluated on eight real-world datasets (Kipf & Welling, 2017; Veličković et al., 2018; Zhu et al., 2021b; Hu et al., 2020): Cora, Citeseer, Pubmed, Amazon Computers, Amazon Photo, Ogbn-Arxiv, Ogbn-Products and Ogbn-Papers100M. ... More detailed statistics of the eight datasets are summarized in Appendix C. |
| Dataset Splits | No | On small-scale datasets, including Cora, Citeseer, Pubmed, Amazon Photo and Amazon Computers, performance is evaluated on random splits: 20 labeled nodes per class are randomly selected for training, and the remaining nodes are used for testing. All results on small-scale datasets are averaged over 50 runs, and standard deviations are reported. For Ogbn-Arxiv, Ogbn-Products and Ogbn-Papers100M, fixed data splits are used as in previous studies (Hu et al., 2020). While training and testing splits are mentioned, an explicit validation split is not described for all datasets. A minimal sketch of the random-split protocol appears after the table. |
| Hardware Specification | Yes | Experiments are conducted on a server with an NVIDIA 3090 GPU (24 GB memory) and an Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz. |
| Software Dependencies | No | All the algorithms and models are implemented in Python and PyTorch Geometric. The paper mentions software but does not specify version numbers for PyTorch Geometric. |
| Experiment Setup | Yes | The key hyperparameter of the framework is the number of clusters, set to [300, 300, 2000, 1300, 700, 20000, 25000, 5000] on the eight datasets, respectively. All models are optimized with the Adam optimizer. The hyperparameters for GCL models trained with StructComp are largely the same as those used for full-graph training of the GCL models; the main hyperparameters are shown in Tables 7 and 8. A hedged configuration sketch follows the split sketch below. |
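
The split protocol quoted in the Dataset Splits row is concrete enough to sketch. Below is a minimal, hedged reconstruction in Python/NumPy; the helper name `per_class_split` and the seed handling are assumptions for illustration, not taken from the paper or its repository.

```python
import numpy as np

def per_class_split(labels, num_per_class=20, seed=0):
    """Randomly pick `num_per_class` labeled nodes per class for training;
    all remaining nodes form the test set. This mirrors the quoted protocol
    for the small-scale datasets (no validation split is described)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx = []
    for c in np.unique(labels):
        class_nodes = np.flatnonzero(labels == c)
        train_idx.extend(rng.choice(class_nodes, size=num_per_class, replace=False))
    train_idx = np.sort(np.array(train_idx))
    test_mask = np.ones(labels.shape[0], dtype=bool)
    test_mask[train_idx] = False
    return train_idx, np.flatnonzero(test_mask)

# The paper averages results over 50 runs; one way to realize that is
# one split per seed:
# splits = [per_class_split(labels, seed=s) for s in range(50)]
```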
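Likewise, the Experiment Setup row pins down one hyperparameter exactly. The mapping below is a hedged reconstruction that assumes "respectively" follows the dataset order quoted in the Open Datasets row; the encoder and learning rate are illustrative placeholders (the actual values live in the paper's Tables 7 and 8 and the released code).

```python
import torch
import torch.nn as nn

# Cluster counts quoted in the paper, paired with the datasets in the
# order they are listed (an assumption based on "respectively").
NUM_CLUSTERS = {
    "Cora": 300,
    "Citeseer": 300,
    "Pubmed": 2000,
    "Amazon Computers": 1300,
    "Amazon Photo": 700,
    "Ogbn-Arxiv": 20000,
    "Ogbn-Products": 25000,
    "Ogbn-Papers100M": 5000,
}

# All models are optimized with Adam per the quoted setup. The encoder and
# learning rate here are placeholders, not the paper's actual choices.
encoder = nn.Linear(1433, 256)  # e.g. Cora's 1433-dimensional features
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
```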