SEC: More Accurate Clustering Algorithm via Structural Entropy

Authors: Junyu Huang, Qilong Feng, Jiahui Wang, Ziyun Huang, Jinhui Xu, Jianxin Wang

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Empirical experiments on both synthetic and real-world datasets demonstrate that our proposed algorithm outperforms state-of-the-art clustering methods and achieves better clustering performances. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Central South University, Changsha 410083, China; (2) Xiangjiang Laboratory, Changsha 410205, China; (3) Department of Computer Science and Software Engineering, Penn State Erie, The Behrend College; (4) Department of Computer Science and Engineering, State University of New York at Buffalo, NY, USA; (5) The Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha 410083, China |
| Pseudocode | Yes | Algorithm 1: SEC |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | all the real-world datasets can be found in the UCI machine learning repository (footnote 1: https://archive.ics.uci.edu). |
| Dataset Splits | No | The paper states that each algorithm is run on each dataset multiple times and that average results are reported, but it does not provide specific details on dataset splits (e.g., train/validation/test percentages, cross-validation folds, or random seeds for partitioning). |
| Hardware Specification | Yes | All the experiments are conducted on 72 Intel Xeon Gold 6230 CPUs with 500GB memory. |
| Software Dependencies | No | The paper lists the names of the algorithms used for comparison, but it does not provide specific details on the software dependencies, such as programming languages, libraries, or their version numbers, needed to replicate the experiments. |
| Experiment Setup | No | The paper states that distances are Euclidean and that experiments are run five times, but it does not provide concrete hyperparameter values or detailed training configurations (such as specific values for 'D', 'N', 'ϵ', 'δ', 't' as listed in Algorithm 1's input) used for the experiments. |
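
The evaluation protocol noted above (Euclidean distances, five runs per dataset, averaged results) could be sketched roughly as follows. This is only an illustration under stated assumptions: the clusterer is plain Lloyd's k-means used as a stand-in, not the paper's SEC algorithm, and the `purity` metric, the seeds, and the toy two-blob dataset are all hypothetical choices that this summary does not confirm.

```python
import math
import random
from statistics import mean


def kmeans(points, k, seed, iters=20):
    """Stand-in clusterer (plain Lloyd's k-means), NOT the paper's SEC algorithm."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center under Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(mean(col) for col in zip(*members))
    return labels


def purity(labels, truth):
    """Fraction of points that belong to the majority true class of their cluster."""
    total = 0
    for c in set(labels):
        classes = [t for l, t in zip(labels, truth) if l == c]
        total += max(classes.count(t) for t in set(classes))
    return total / len(labels)


def average_score(points, truth, k, runs=5):
    """Run the clusterer `runs` times with different seeds and average the scores."""
    return mean(purity(kmeans(points, k, seed), truth) for seed in range(runs))


# Tiny hypothetical dataset: two well-separated 2-D blobs.
data_rng = random.Random(0)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(30)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(30)]
truth = [0] * 30 + [1] * 30

score = average_score(points, truth, k=2)
print(f"mean purity over 5 runs: {score:.3f}")
```

Fixing the seed for each run is one way to make the averaged figure repeatable. The summary notes that the paper does not report seeds, so an exact replication of its numbers would still need that information from the authors.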