Chromatic Correlation Clustering, Revisited

Authors: Qing Xiu, Kai Han, Jing Tang, Shuang Cui, He Huang

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct extensive experiments using real-world datasets and the results strongly demonstrate the superiorities of Greedy Vote in terms of both clustering quality and efficiency, compared to the state-of-the-art algorithms with or without performance guarantees.In this section, we compare our Greedy Vote (GV) algorithm (i.e., Algorithm 2) with six state-of-the-art algorithms for the CCC problem, including: (1) The Pivot Algorithm [2]; (2) The Reduce and Cluster (RC) Algorithm [3]; (3) The Deep Cluster (DC) algorithm [3]; (4) The Chromatic Ball (CB) algorithm [6]; (5) The Greedy Expansion (GE) Algorithm [25]; and (6) The modified greedy expansion (GER) Algorithm [25].
Researcher Affiliation Academia Qing Xiu School of Computer Science and Technology Suzhou Institute for Advanced Research University of Science and Technology of China xiuq@mail.ustc.edu.cnKai Han School of Computer Science and Technology Soochow University hankai@suda.edu.cnJing Tang The Hong Kong University of Science and Technology (Guangzhou) The Hong Kong University of Science and Technology jingtang@ust.hkShuang Cui School of Computer Science and Technology Suzhou Institute for Advanced Research University of Science and Technology of China lakers@mail.ustc.edu.cnHe Huang School of Computer Science and Technology Soochow University huangh@suda.edu.cn
Pseudocode Yes Algorithm 1: LP-Clustering and Algorithm 2: Greedy Vote
Open Source Code Yes Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See the supplemental material
Open Datasets Yes We use ten real-world datasets from different domains in our experiments, as listed in Table 1. Among these datasets, Facebook1, Facebook2, Lastfm and Twitter are social network graphs downloaded from [28], where vertices represent users and edges represent friendships. Following the setting in Anava et al. [3], we assign each social circle a color and set the color of each edge uv as the color of u and v s common social circle (if there are multiple common circles then choose an arbitrary one). The DAWN dataset [1][25] models a drug abuse warning network... The Cooking dataset [23][25] models an ingredient co-occur network... String1 and String2 [12] are networks containing protein-protein interactions... Finally, DBLP and MAG [25] are co-authorship networks...
Dataset Splits No Section 6 describes the datasets used and some parameter settings for the algorithms, but it does not specify how the datasets were split into training, validation, or test sets.
Hardware Specification Yes All our experiments are conducted on an Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz with 128GB of RAM.
Software Dependencies No We implement the evaluated algorithms using Python and also use the public code of Klodt et al. [25] for implementation. No specific version numbers for Python or other software libraries are provided.
Experiment Setup Yes In our experiments, we set ϵ = 0.55 and m = 10 for the GV algorithm, and follow all the parameter settings (if any) of other algorithms according to their proposals.