GLCC: A General Framework for Graph-Level Clustering
Authors: Wei Ju, Yiyang Gu, Binqi Chen, Gongbo Sun, Yifang Qin, Xingyuming Liu, Xiao Luo, Ming Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a range of well-known datasets demonstrate the superiority of our proposed GLCC over competitive baselines. |
| Researcher Affiliation | Academia | ¹School of Computer Science, Peking University, China; ²School of EECS, Peking University, China; ³Beijing National Day School, China; ⁴Department of Computer Science, University of California Los Angeles, USA |
| Pseudocode | Yes | Algorithm 1: Optimization Algorithm of GLCC |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described in the paper. The only link provided is to a dataset platform (anchorquery.csb.pitt.edu). |
| Open Datasets | Yes | We conduct extensive experiments on two kinds of datasets: biochemical molecule datasets and social network datasets. For biochemical molecule datasets, we adopt DD from TU datasets (Morris et al. 2020), and Anchor Query collected from the Anchor Query platform to test clustering performance on a large cluster number. Specifically, we construct Anchor Query-10K and Anchor Query-25K datasets with compounds generated from 10 and 25 types of multicomponent reactions (MCRs), respectively. For social network datasets, we adopt IMDB-B, REDDIT-B, and REDDIT-12K datasets from TU datasets. (A dataset-loading sketch appears after the table.) |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, and test dataset splits, only mentioning batch sizes and total epochs. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "Graph Isomorphism Network (GIN) (Xu et al. 2019) as the backbone" but does not provide specific version numbers for GIN or any other software dependencies. |
| Experiment Setup | Yes | For a fair comparison with previous graph contrastive learning methods, we adopt Graph Isomorphism Network (GIN) (Xu et al. 2019) as the backbone for all baselines. The number of GIN layers is 3, and the hidden dimension is set to 64. The batch size is set to 64 for DD, IMDB-B, and REDDIT-B, and 256 for Anchor Query-10K, Anchor Query-25K, and REDDIT-12K. The temperatures in instance- and cluster-level graph contrastive losses are set to 0.1 and 1.0, respectively. For affinity graph construction, we set neighbor number k = 5. Pseudo-label ratio r is set to 0.1. The perturbation ratio of graph augmentation is set to 0.1. The total number of training epochs is set to 100. (A configuration sketch based on these values follows the table.) |
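For readers who want to obtain the public benchmark datasets named in the Open Datasets row, the sketch below shows one way the TU datasets (DD, IMDB-B, REDDIT-B, REDDIT-12K) could be loaded with PyTorch Geometric's `TUDataset` loader. The paper does not name a data-loading library, so the loader, the `load_tu_dataset` helper, and the `root` path are assumptions for illustration; the Anchor Query-10K/25K datasets are constructed from the Anchor Query platform (anchorquery.csb.pitt.edu) and are not part of the TU collection.

```python
# Sketch (assumed tooling): loading the TU benchmark datasets with PyTorch Geometric.
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader

# TU dataset identifiers corresponding to DD, IMDB-B, REDDIT-B, and REDDIT-12K.
TU_NAMES = ["DD", "IMDB-BINARY", "REDDIT-BINARY", "REDDIT-MULTI-12K"]

def load_tu_dataset(name: str, root: str = "data/TU"):
    """Download (if needed) and return one of the TU benchmark datasets."""
    return TUDataset(root=root, name=name)

if __name__ == "__main__":
    dataset = load_tu_dataset("DD")
    # Batch size 64 matches the setting reported for DD in the table above.
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    print(dataset, "num_graphs:", len(dataset))
```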
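The Experiment Setup row fixes every reported hyperparameter, so a minimal configuration and backbone sketch can be assembled from it. The numeric values below come directly from that row; the `CONFIG` dictionary, the `GraphEncoder` class, and the use of PyTorch Geometric's `GIN` model with sum pooling are illustrative assumptions, not the authors' released code (no code was released).

```python
# Sketch (assumed structure): GIN graph encoder plus the reported hyperparameters.
import torch
from torch_geometric.nn import GIN, global_add_pool

# Hyperparameters as reported in the Experiment Setup row above.
CONFIG = {
    "gin_layers": 3,
    "hidden_dim": 64,
    "batch_size": {"DD": 64, "IMDB-B": 64, "REDDIT-B": 64,
                   "Anchor Query-10K": 256, "Anchor Query-25K": 256,
                   "REDDIT-12K": 256},
    "tau_instance": 0.1,        # temperature, instance-level contrastive loss
    "tau_cluster": 1.0,         # temperature, cluster-level contrastive loss
    "knn_k": 5,                 # neighbor number for affinity graph construction
    "pseudo_label_ratio": 0.1,  # ratio r of pseudo-labeled samples
    "aug_perturb_ratio": 0.1,   # perturbation ratio of graph augmentation
    "epochs": 100,
}

class GraphEncoder(torch.nn.Module):
    """GIN backbone followed by sum pooling to obtain graph-level embeddings."""

    def __init__(self, in_dim: int, hidden_dim: int = 64, num_layers: int = 3):
        super().__init__()
        self.gnn = GIN(in_channels=in_dim, hidden_channels=hidden_dim,
                       num_layers=num_layers)

    def forward(self, x, edge_index, batch):
        h = self.gnn(x, edge_index)       # node embeddings
        return global_add_pool(h, batch)  # one embedding per graph
```

Sum pooling is only one plausible readout; the paper does not specify the pooling operator, so that choice (like the variable names) should be treated as an assumption.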