Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention
Authors: Siyuan Huang, Yunchong Song, Jiayue Zhou, Zhouhan Lin
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), which uses node clusters as tokens and employs our proposed N2C-Attn module, shows superior performance on various graph-level tasks. Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer. To evaluate the performance of Cluster-GT, we compare it against two categories of methods: Graph Pooling and Graph Transformers. We conduct experiments on eight graph classification datasets from different domains, including social networks and biology. |
| Researcher Affiliation | Academia | Siyuan Huang1,2 Yunchong Song1 Jiayue Zhou2 Zhouhan Lin1 1LUMIA Lab, Shanghai Jiao Tong University 2Paris Elite Institute of Technology, Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes procedures and mathematical formulations but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer. |
| Open Datasets | Yes | We conduct experiments on eight graph classification datasets from different domains, including social networks and biology. (...) Table 3: Summary statistics of datasets (IMDB-BINARY, IMDB-MULTI, COLLAB, MUTAG, PROTEINS, D&D, ZINC, MolHIV) |
| Dataset Splits | Yes | Moreover, 10 percent of the training data is allocated as validation data to ensure a fair comparison, as per [10]. We utilize a standard train/validation/test dataset split following [18]. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA RTX 3090s with 24GB of RAM. |
| Software Dependencies | No | The model is implemented using PyTorch and PyG [11]. |
| Experiment Setup | Yes | For optimization, the Adam [26] optimizer is utilized, adhering to the default settings of β1 = 0.9, β2 = 0.999, and ε = 1e-8. An early stopping criterion is implemented, halting training if there is no improvement in validation loss over 50 epochs. The training process is capped at a maximum of 500 epochs. We use a batch size of 64. |
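
The training schedule quoted in the Experiment Setup and Dataset Splits rows can be illustrated with a minimal, framework-free Python sketch. It is not the authors' implementation: the names `TrainConfig`, `holdout_split`, and `train_with_early_stopping` are hypothetical, the seeded random holdout is an assumption (the paper defers its split procedure to its references [10] and [18]), and `run_epoch` stands in for one pass of Adam training followed by validation.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values quoted in the table above."""
    adam_beta1: float = 0.9
    adam_beta2: float = 0.999
    adam_eps: float = 1e-8
    patience: int = 50         # epochs without validation-loss improvement
    max_epochs: int = 500      # hard cap on training length
    batch_size: int = 64
    val_fraction: float = 0.1  # share of training data held out for validation


def holdout_split(
    indices: Sequence[int], val_fraction: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle training indices and hold out a fraction for validation.

    Assumption: a seeded uniform shuffle; the paper's exact procedure may differ.
    """
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def train_with_early_stopping(
    run_epoch: Callable[[int], float], cfg: TrainConfig
) -> Tuple[int, float, int]:
    """Run epochs until the cap or until patience is exhausted.

    run_epoch(epoch) is assumed to train for one epoch and return the
    validation loss. Returns (best_epoch, best_loss, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_run = 0
    for epoch in range(cfg.max_epochs):
        val_loss = run_epoch(epoch)
        epochs_run = epoch + 1
        if val_loss < best_loss:
            best_loss, best_epoch = val_loss, epoch
        elif epoch - best_epoch >= cfg.patience:
            # No improvement for `patience` consecutive epochs: stop.
            break
    return best_epoch, best_loss, epochs_run
```

In a real run, `run_epoch` would wrap the model's training and validation passes, and the three Adam fields would be passed to the optimizer. The sketch only fixes the stopping logic and the hold-out proportion.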