Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention

Authors: Siyuan Huang, Yunchong Song, Jiayue Zhou, Zhouhan Lin

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), which uses node clusters as tokens and employs our proposed N2C-Attn module, shows superior performance on various graph-level tasks. Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer. To evaluate the performance of Cluster-GT, we compare it against two categories of methods: Graph Pooling and Graph Transformers. We conduct experiments on eight graph classification datasets from different domains, including social networks and biology.
Researcher Affiliation | Academia | Siyuan Huang (1,2), Yunchong Song (1), Jiayue Zhou (2), Zhouhan Lin (1); (1) LUMIA Lab, Shanghai Jiao Tong University; (2) Paris Elite Institute of Technology, Shanghai Jiao Tong University
Pseudocode | No | The paper describes procedures and mathematical formulations but does not present them in a structured pseudocode or algorithm block.
Open Source Code | Yes | Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer.
Open Datasets | Yes | We conduct experiments on eight graph classification datasets from different domains, including social networks and biology. (...) Table 3: Summary statistics of datasets (IMDB-BINARY, IMDB-MULTI, COLLAB, MUTAG, PROTEINS, D&D, ZINC, MolHIV). A minimal loading sketch follows the table.
Dataset Splits | Yes | Moreover, 10 percent of the training data is allocated as validation data to ensure a fair comparison, as per [10]. We utilize a standard train/validation/test dataset split following [18]. (See the split sketch below the table.)
Hardware Specification | Yes | All experiments are conducted on NVIDIA RTX 3090s with 24GB of RAM.
Software Dependencies | No | The model is implemented using PyTorch and PyG [11]. (Library versions are not specified.)
Experiment Setup | Yes | For optimization, the Adam [26] optimizer is utilized, adhering to the default settings of β1 = 0.9, β2 = 0.999, and ε = 1e-8. An early stopping criterion is implemented, halting training if there is no improvement in validation loss over 50 epochs. The training process is capped at a maximum of 500 epochs. We use a batch size of 64. (A training-setup sketch follows the table.)
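
To make the dataset row concrete, here is a minimal loading sketch assuming the TU benchmarks are fetched through PyG's TUDataset; the root path and the choice of MUTAG are illustrative, and ZINC and MolHIV need their own loaders (torch_geometric.datasets.ZINC and OGB's graph property prediction interface), which are not shown.

```python
# Minimal sketch (our illustration): fetching one of the listed TU benchmarks with PyG.
# The root directory is arbitrary; ZINC and MolHIV use other loaders.
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader

dataset = TUDataset(root="data/TUDataset", name="MUTAG")  # also: PROTEINS, DD, COLLAB, IMDB-BINARY, IMDB-MULTI
print(dataset)                                            # e.g. MUTAG(188)
print(dataset.num_classes, dataset.num_node_features)

loader = DataLoader(dataset, batch_size=64, shuffle=True)  # batch size 64, as reported
for batch in loader:
    print(batch)  # a batched graph with x, edge_index, y, and the batch assignment vector
    break
```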
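The split row quotes "10 percent of the training data is allocated as validation data"; a rough sketch of such a carve-out is below. The exact protocols of [10] and [18] are not reproduced here, and the seed and shuffling are our assumptions.

```python
# Rough sketch of carving 10% of a training set out as validation data.
# Not the exact protocol of [10]/[18]; the seed and ordering are assumptions.
import torch

def train_val_split(train_dataset, val_ratio=0.1, seed=0):
    n = len(train_dataset)
    n_val = int(n * val_ratio)
    perm = torch.randperm(n, generator=torch.Generator().manual_seed(seed)).tolist()
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    return train_dataset[train_idx], train_dataset[val_idx]

# Usage with a PyG dataset (indexing by a list selects a subset):
# train_set, val_set = train_val_split(dataset, val_ratio=0.1)
```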
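The experiment-setup row translates into roughly the following training scaffold: Adam with the default betas and epsilon, early stopping on validation loss with a patience of 50 epochs, and a 500-epoch cap. The model, loss, data loaders, and learning rate are placeholders, not values taken from the paper.

```python
# Hedged sketch of the reported optimization settings: Adam (betas 0.9/0.999, eps 1e-8),
# early stopping on validation loss with patience 50, and at most 500 epochs.
# `model`, `criterion`, the loaders, and `lr` are placeholders, not from the paper.
import torch

def train(model, criterion, train_loader, val_loader, lr=1e-3,
          max_epochs=500, patience=50, device="cuda"):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 betas=(0.9, 0.999), eps=1e-8)
    best_val, epochs_without_improvement = float("inf"), 0

    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:
            batch = batch.to(device)
            optimizer.zero_grad()
            loss = criterion(model(batch), batch.y)
            loss.backward()
            optimizer.step()

        # Average validation loss for this epoch.
        model.eval()
        val_loss, n_batches = 0.0, 0
        with torch.no_grad():
            for batch in val_loader:
                batch = batch.to(device)
                val_loss += criterion(model(batch), batch.y).item()
                n_batches += 1
        val_loss /= max(n_batches, 1)

        # Early stopping: halt if validation loss has not improved for `patience` epochs.
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
```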