reproducibilityindex.ai

A General Clustering Agreement Index: For Comparing Disjoint and Overlapping Clusters

Authors: Reihaneh Rabbany, Osmar Zaïane

AAAI 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical comparison of four well-known overlapping community detection methods, obtained based on the previous and the newly proposed agreement indexes. In more detail, we use the overlapping LFR benchmark generators (Lancichinetti, Fortunato, and Kertesz 2008), to synthesize benchmarks with varying fraction of overlapping nodes (10 realizations for each setting to report the average).
Researcher Affiliation	Academia	Reihaneh Rabbany, Osmar R. Zaïane, Department of Computing Science, University of Alberta, Edmonton, AB, Canada {rabbanyk, zaiane}@ualberta.ca
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets	No	The paper mentions using 'overlapping LFR benchmark generators (Lancichinetti, Fortunato, and Kertesz 2008)' to synthesize datasets. While it cites the generator, it does not provide concrete access information (e.g., URL, DOI, or specific parameters for exact reproduction) for the specific datasets generated and used in the experiments described in this paper.
Dataset Splits	No	The paper does not provide specific dataset split information (e.g., exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	The experimental settings are as follows. First, we generate a set of benchmark datasets using a generator which synthesizes networks with built-in ground-truth communities. Then, these datasets are clustered with different community detection algorithms. Finally, the results obtained from different algorithms are compared against the ground-truth in these benchmarks, using a clustering agreement index. In more detail, we use the overlapping LFR benchmark generators (Lancichinetti, Fortunato, and Kertesz 2008), to synthesize benchmarks with varying fraction of overlapping nodes (10 realizations for each setting to report the average). The overlapping community detection methods included in this comparison are: COPRA (Gregory 2010), MOSES (McDaid and Hurley 2010), OSLOM (Lancichinetti et al. 2011), and BIGCLAM (Yang and Leskovec 2013). We apply the overlapping extensions of NMI, i.e., NMI by Lancichinetti, Fortunato, and Kertesz (2008) and NMI by McDaid, Greene, and Hurley (2011); the adjusted omega index (Aω) by Collins and Dent (1988); the δ-based formulations for the ARI by Rabbany and Zaïane (2015), to compare them against the ARI and NMI overlapping extensions presented in this paper, i.e., CRI and CMI derived from our CAI generalization.
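
The final step of the setup above, comparing a detected overlapping clustering against the ground truth with an agreement index, can be illustrated with a minimal sketch. It implements the omega index in its commonly used chance-corrected form, based on how many clusters each pair of nodes shares. It is not the paper's CRI or CMI, whose definitions are not reproduced on this page. The function names `pair_counts` and `omega_index`, the toy covers, and the convention of returning 1.0 when chance agreement is already perfect are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_counts(cover, nodes):
    """Map each unordered node pair to the number of clusters containing both."""
    node_set = set(nodes)
    counts = Counter()
    for cluster in cover:
        members = sorted(set(cluster) & node_set)
        for pair in combinations(members, 2):
            counts[pair] += 1
    return counts


def omega_index(cover_a, cover_b, nodes):
    """Chance-corrected pairwise agreement between two (possibly overlapping) covers."""
    nodes = sorted(set(nodes))
    n_pairs = len(nodes) * (len(nodes) - 1) // 2
    if n_pairs == 0:
        raise ValueError("need at least two nodes")

    counts_a = pair_counts(cover_a, nodes)
    counts_b = pair_counts(cover_b, nodes)

    # Observed agreement: pairs that share the same number of clusters in both covers.
    agree_nonzero = sum(1 for pair, k in counts_a.items() if counts_b.get(pair, 0) == k)
    agree_zero = n_pairs - len(set(counts_a) | set(counts_b))
    observed = (agree_nonzero + agree_zero) / n_pairs

    # Expected agreement: product of the marginal pair-count distributions.
    hist_a = Counter(counts_a.values())
    hist_a[0] = n_pairs - len(counts_a)
    hist_b = Counter(counts_b.values())
    hist_b[0] = n_pairs - len(counts_b)
    expected = sum(hist_a[j] * hist_b[j] for j in hist_a) / (n_pairs ** 2)

    if expected == 1.0:
        # Assumed convention: both covers are trivial and identical in pair structure.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example (hypothetical data): node 2 belongs to two ground-truth communities.
nodes = range(6)
ground_truth = [{0, 1, 2}, {2, 3, 4, 5}]
single_cluster = [{0, 1, 2, 3, 4, 5}]

score_identical = omega_index(ground_truth, ground_truth, nodes)
score_trivial = omega_index(ground_truth, single_cluster, nodes)
```

On this toy input, a cover compared with itself scores 1, while the single all-nodes cluster scores 0 because its agreement with the ground truth equals what chance alone predicts. In an experiment like the one described, such a score would be computed per benchmark realization and averaged over the 10 realizations of each setting.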