Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Provable Training for Graph Contrastive Learning
Authors: Yue Yu, Xiao Wang, Mengmei Zhang, Nian Liu, Chuan Shi
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on various benchmarks, POT consistently improves the existing GCL approaches, serving as a friendly plugin. |
| Researcher Affiliation | Academia | Yue Yu¹, Xiao Wang², Mengmei Zhang¹, Nian Liu¹, Chuan Shi¹; ¹Beijing University of Posts and Telecommunications, China; ²Beihang University, China |
| Pseudocode | Yes | Algorithm 1: Provable Training for GCL |
| Open Source Code | Yes | The complete implementation can be found at https://github.com/VoidHaruhi/POT-GCL. We also provide an implementation based on GammaGL [12] at https://github.com/BUPT-GAMMA/GammaGL. |
| Open Datasets | Yes | We obtain the datasets from PyG [3]. Although the datasets are available for public use, we cannot find their licenses. The datasets can be found in the URLs below: Cora, CiteSeer, PubMed: https://github.com/kimiyoung/planetoid/raw/master/data; BlogCatalog: https://docs.google.com/uc?export=download&id=178PqGqh67RUYMMP6SoRHDoIBh8ku5FS&confirm=t; Flickr: https://docs.google.com/uc?export=download&id=1tZp3EB20fAC27SYWwax66_8uGsuU62X&confirm=t; Computers, Photo: https://github.com/shchur/gnn-benchmark/raw/master/data/npz/; WikiCS: https://github.com/pmernyei/wiki-cs-dataset/raw/master/dataset |
| Dataset Splits | Yes | For datasets with a public split available [28], including Cora, CiteSeer, and PubMed, we follow the public split; for other datasets with no public split, we generate random splits, where each of the training set and validation set contains 10% of the nodes of the graph and the remaining 80% of the nodes are used for testing. |
| Hardware Specification | Yes | OS: Linux 5.4.0-131-generic; CPU: Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz; GPU: GeForce RTX 3090 |
| Software Dependencies | No | The paper mentions implementing with 'PyTorch' and using 'PyG' for datasets, but does not specify version numbers for these software components. |
| Experiment Setup | Yes | Table 4: Hyperparameters: (p_e^1, p_e^2) Models Cora CiteSeer PubMed Flickr BlogCatalog Computers Photo WikiCS... Table 5: Hyperparameters: (τ, κ) Models Cora CiteSeer PubMed Flickr BlogCatalog Computers Photo WikiCS... Table 6: Hyperparameters: (pot_batch, num_epochs) Models Cora CiteSeer PubMed Flickr BlogCatalog Computers Photo WikiCS |
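
The Dataset Splits row describes random splits with 10% of nodes for training, 10% for validation, and the remaining 80% for testing. The sketch below illustrates one way such a split could be generated. It is not taken from the authors' code: the helper name `random_node_split`, the `seed` parameter, and the use of Python's standard `random` module are all assumptions made for illustration.

```python
import random


def random_node_split(num_nodes, train_frac=0.1, val_frac=0.1, seed=0):
    """Hypothetical helper: split node indices into train/val/test sets.

    The 10%/10%/80% proportions follow the Dataset Splits row; the
    seeding scheme is an assumption, not the paper's procedure.
    """
    rng = random.Random(seed)          # local RNG so the split is repeatable
    indices = list(range(num_nodes))
    rng.shuffle(indices)               # random permutation of node ids

    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)

    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]   # everything left over
    return train_idx, val_idx, test_idx


if __name__ == "__main__":
    train_idx, val_idx, test_idx = random_node_split(1000)
    print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed makes a given split repeatable across runs, which matters when comparing methods on datasets that have no public split.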