GLCC: A General Framework for Graph-Level Clustering
Authors: Wei Ju, Yiyang Gu, Binqi Chen, Gongbo Sun, Yifang Qin, Xingyuming Liu, Xiao Luo, Ming Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a range of well-known datasets demonstrate the superiority of our proposed GLCC over competitive baselines. |
| Researcher Affiliation | Academia | ¹School of Computer Science, Peking University, China; ²School of EECS, Peking University, China; ³Beijing National Day School, China; ⁴Department of Computer Science, University of California Los Angeles, USA |
| Pseudocode | Yes | Algorithm 1: Optimization Algorithm of GLCC |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described in the paper. The only link provided is to a dataset platform (anchorquery.csb.pitt.edu). |
| Open Datasets | Yes | We conduct extensive experiments on two kinds of datasets: biochemical molecule datasets and social network datasets. For biochemical molecule datasets, we adopt DD from TU datasets (Morris et al. 2020), and Anchor Query collected from the Anchor Query platform to test clustering performance on a large cluster number. Specifically, we construct Anchor Query-10K and Anchor Query-25K datasets with compounds generated from 10 and 25 types of multicomponent reactions (MCRs), respectively. For social network datasets, we adopt IMDB-B, REDDIT-B, and REDDIT-12K datasets from TU datasets. (A dataset-loading sketch appears after the table.) |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, and test dataset splits, only mentioning batch sizes and total epochs. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "Graph Isomorphism Network (GIN) (Xu et al. 2019) as the backbone" but does not provide specific version numbers for GIN or any other software dependencies. |
| Experiment Setup | Yes | For a fair comparison with previous graph contrastive learning methods, we adopt Graph Isomorphism Network (GIN) (Xu et al. 2019) as the backbone for all baselines. The number of GIN layers is 3, and the hidden dimension is set to 64. The batch size is set to 64 for DD, IMDB-B, and REDDIT-B, and 256 for Anchor Query-10K, Anchor Query-25K, and REDDIT-12K. The temperatures in instance- and cluster-level graph contrastive losses are set to 0.1 and 1.0, respectively. For affinity graph construction, we set neighbor number k = 5. Pseudo-label ratio r is set to 0.1. The perturbation ratio of graph augmentation is set to 0.1. The total number of training epochs is set to 100. (A configuration sketch based on these values follows the table.) |
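For readers who want to obtain the public benchmark datasets named in the Open Datasets row, the sketch below shows one way the TU datasets (DD, IMDB-B, REDDIT-B, REDDIT-12K) could be loaded with PyTorch Geometric's `TUDataset` loader. The paper does not name a data-loading library, so the loader, the `load_tu_dataset` helper, and the `root` path are assumptions for illustration; the Anchor Query-10K/25K datasets are constructed from the Anchor Query platform (anchorquery.csb.pitt.edu) and are not part of the TU collection.

```python
# Sketch (assumed tooling): loading the TU benchmark datasets with PyTorch Geometric.
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader

# TU dataset identifiers corresponding to DD, IMDB-B, REDDIT-B, and REDDIT-12K.
TU_NAMES = ["DD", "IMDB-BINARY", "REDDIT-BINARY", "REDDIT-MULTI-12K"]

def load_tu_dataset(name: str, root: str = "data/TU"):
    """Download (if needed) and return one of the TU benchmark datasets."""
    return TUDataset(root=root, name=name)

if __name__ == "__main__":
    dataset = load_tu_dataset("DD")
    # Batch size 64 matches the setting reported for DD in the table above.
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    print(dataset, "num_graphs:", len(dataset))
```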
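The Experiment Setup row fixes every reported hyperparameter, so a minimal configuration and backbone sketch can be assembled from it. The numeric values below come directly from that row; the `CONFIG` dictionary, the `GraphEncoder` class, and the use of PyTorch Geometric's `GIN` model with sum pooling are illustrative assumptions, not the authors' released code (no code was released).

```python
# Sketch (assumed structure): GIN graph encoder plus the reported hyperparameters.
import torch
from torch_geometric.nn import GIN, global_add_pool

# Hyperparameters as reported in the Experiment Setup row above.
CONFIG = {
    "gin_layers": 3,
    "hidden_dim": 64,
    "batch_size": {"DD": 64, "IMDB-B": 64, "REDDIT-B": 64,
                   "Anchor Query-10K": 256, "Anchor Query-25K": 256,
                   "REDDIT-12K": 256},
    "tau_instance": 0.1,        # temperature, instance-level contrastive loss
    "tau_cluster": 1.0,         # temperature, cluster-level contrastive loss
    "knn_k": 5,                 # neighbor number for affinity graph construction
    "pseudo_label_ratio": 0.1,  # ratio r of pseudo-labeled samples
    "aug_perturb_ratio": 0.1,   # perturbation ratio of graph augmentation
    "epochs": 100,
}

class GraphEncoder(torch.nn.Module):
    """GIN backbone followed by sum pooling to obtain graph-level embeddings."""

    def __init__(self, in_dim: int, hidden_dim: int = 64, num_layers: int = 3):
        super().__init__()
        self.gnn = GIN(in_channels=in_dim, hidden_channels=hidden_dim,
                       num_layers=num_layers)

    def forward(self, x, edge_index, batch):
        h = self.gnn(x, edge_index)       # node embeddings
        return global_add_pool(h, batch)  # one embedding per graph
```

Sum pooling is only one plausible readout; the paper does not specify the pooling operator, so that choice (like the variable names) should be treated as an assumption.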