Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

THESAURUS: Contrastive Graph Clustering by Swapping Fused Gromov-Wasserstein Couplings

Authors: Bowen Deng, Tong Wang, Lele Fu, Sheng Huang, Chuan Chen, Tao Zhang

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate the performance of THESAURUS, we run the proposed method on nine attribute graph datasets, including Cora, Citeseer, Pubmed, Amazon-Photo (A-Photo), Cora Full, ACM, DBLP, UAT, and Wiki. The baselines are Kmeans, DEC, GRACE (Zhu et al. 2020), SDCN, DFCN, DCRN, S3GC, SCGC, HSAN, and Dink-Net. Our evaluation protocol follows that of the previous SOTA Dink-Net (Liu et al. 2023a). Besides Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI), the metrics include Accuracy (ACC) and the Macro-F1 score (F1)...
Researcher Affiliation	Academia	(1) School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; (2) School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China
Pseudocode	Yes	The illustration of our proposed THESAURUS. And the details are summarized in Algorithm 1 in the appendix.
Open Source Code	No	The paper states, 'This research utilizes publicly available datasets and comparison methods, all of which are based on open-source code,' but it does not provide an explicit statement or link indicating that the source code for the proposed THESAURUS method is publicly available.
Open Datasets	Yes	To evaluate the performance of THESAURUS, we run the proposed method on nine attribute graph datasets, including Cora, Citeseer, Pubmed, Amazon-Photo (A-Photo), Cora Full, ACM, DBLP, UAT, and Wiki.
Dataset Splits	No	The paper states, 'Our evaluation protocol follows that of the previous SOTA Dink-Net (Liu et al. 2023a),' but it does not explicitly provide specific training/test/validation dataset splits or cross-validation details within the main text.
Hardware Specification	Yes	Part of the results are summarized in Table 1, with OOM indicating out-of-memory failures on one RTX 4090 GPU.
Software Dependencies	No	The paper mentions various algorithms and models such as K-means, GCN, Sinkhorn, and t-SNE, but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup	No	The paper does not provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) in the main text.