Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards Understanding Parametric Generalized Category Discovery on Graphs
Authors: Bowen Deng, Lele Fu, Jialong Chen, Sheng Huang, Tianchi Liao, Zhang Tao, Chuan Chen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results validate our theoretical findings and demonstrate SWIRL's effectiveness. ... We validate all theoretical analyses on synthetic datasets and demonstrate SWIRL's effectiveness on real-world graphs. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China 2School of Systems Science and Engineering, Sun Yat-Sen University, Guangzhou, China 3School of Software Engineering, Sun Yat-Sen University, Zhuhai, China. Correspondence to: Chuan Chen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 The full procedure of SWIRL |
| Open Source Code | No | The paper does not explicitly state that the source code for the methodology described is open-source or provide a link to a repository. It discusses third-party tools and benchmarks but not its own implementation code. |
| Open Datasets | Yes | We created node-level GGCD datasets based on five existing datasets: Cora, Citeseer, Wiki, A-Computers, and A-Photo. For Cora and Citeseer, we used the public splits, while for the other three datasets, the entire node set is stratified into train, validation, and test subsets in a 2:2:6 ratio. |
| Dataset Splits | Yes | The entire node set is stratified into train, validation, and test subsets in a 2:2:6 ratio. |
| Hardware Specification | Yes | The first system runs Ubuntu 22.04 and is equipped with an RTX 4090 GPU (24GB), an Intel i7-12700 CPU, and 64GB of RAM. The second system, which uses Ubuntu 20.04, features an RTX 4090 GPU (24GB), dual Intel Xeon Gold 6240C processors, and 126GB of RAM. |
| Software Dependencies | Yes | Both systems have the same Conda environment, which includes PyTorch 2.5 (Paszke et al., 2017) and PyG 2.5 (Fey & Lenssen, 2019), all built on CUDA 12.1. |
| Experiment Setup | Yes | We train the model for 1000 epochs using the Adam optimizer (Kingma & Ba, 2017) with a learning rate of 0.01. ... The training loss is L = (1 − α2)(L_NCE + β1·L_SW + α1·L_ER) + α2·L_CE. We set α2 = 0.35 and α1 = 2, which are commonly used values in many baseline models (Vaze et al., 2022; Wen et al., 2023), and choose β1 = 20. ... Table 5: Hyperparameters of GGCD methods and the corresponding values or search spaces |
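The quoted 2:2:6 stratified split can be sketched as a small helper. This is a minimal illustration, not the authors' code: the function name `stratified_split`, the per-class rounding, and the fixed seed are all assumptions for the sake of a runnable example.

```python
import random
from collections import defaultdict

def stratified_split(labels, ratios=(0.2, 0.2, 0.6), seed=0):
    """Split node indices into train/val/test subsets per class,
    approximating the 2:2:6 stratified split described in the table.
    `labels[i]` is the class of node i."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n = len(idxs)
        n_tr = round(n * ratios[0])   # ~20% train per class
        n_va = round(n * ratios[1])   # ~20% validation per class
        train += idxs[:n_tr]
        val += idxs[n_tr:n_tr + n_va]
        test += idxs[n_tr + n_va:]   # remaining ~60% test
    return train, val, test
```

Splitting within each class (rather than globally) is what makes the split stratified: every class keeps roughly the same 2:2:6 proportions.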
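The weighted training loss quoted above is simple enough to express directly. A minimal sketch, assuming the equation reads L = (1 − α2)(L_NCE + β1·L_SW + α1·L_ER) + α2·L_CE with the reported values α2 = 0.35, α1 = 2, β1 = 20; the function name and argument order are hypothetical:

```python
def total_loss(l_nce, l_sw, l_er, l_ce, alpha1=2.0, alpha2=0.35, beta1=20.0):
    """Combine the component losses with the reported weights:
    L = (1 - alpha2) * (L_NCE + beta1 * L_SW + alpha1 * L_ER) + alpha2 * L_CE.
    Works on floats or on autograd tensors (e.g. PyTorch) unchanged."""
    return (1 - alpha2) * (l_nce + beta1 * l_sw + alpha1 * l_er) + alpha2 * l_ce
```

With α2 = 0.35, the cross-entropy term contributes 35% of the total while the remaining 65% is shared by the NCE, SW, and ER terms according to β1 and α1.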