SEC: More Accurate Clustering Algorithm via Structural Entropy
Authors: Junyu Huang, Qilong Feng, Jiahui Wang, Ziyun Huang, Jinhui Xu, Jianxin Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical experiments on both synthetic and real-world datasets demonstrate that our proposed algorithm outperforms state-of-the-art clustering methods and achieves better clustering performances. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Central South University, Changsha 410083, China; (2) Xiangjiang Laboratory, Changsha 410205, China; (3) Department of Computer Science and Software Engineering, Penn State Erie, The Behrend College; (4) Department of Computer Science and Engineering, State University of New York at Buffalo, NY, USA; (5) The Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha 410083, China |
| Pseudocode | Yes | Algorithm 1: SEC |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | all the real-world datasets can be found in the UCI machine learning repository (footnote 1: https://archive.ics.uci.edu). |
| Dataset Splits | No | The paper states running algorithms on each dataset multiple times and reporting average results, but it does not provide specific details on dataset splits (e.g., train/validation/test percentages, cross-validation folds, or random seeds for partitioning). |
| Hardware Specification | Yes | All the experiments are conducted on 72 Intel Xeon Gold 6230 CPUs with 500GB memory. |
| Software Dependencies | No | The paper lists the names of algorithms used for comparison, but it does not provide specific details on the software dependencies, such as programming languages, libraries, or their version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper states that distances are set as Euclidean, and experiments are run five times, but it does not provide concrete hyperparameter values or detailed training configurations (such as specific values for 'D', 'N', 'ϵ', 'δ', 't' as listed in Algorithm 1's input) used for the experiments. |
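
The Experiment Setup row describes a protocol of Euclidean distances with results averaged over five runs. A minimal sketch of that protocol follows, under loud assumptions: the clustering routine is plain Lloyd's k-means used as a stand-in (it is not the paper's SEC algorithm), and the toy points, the seed values, the cost measure, and the names `kmeans_cost` and `average_over_runs` are hypothetical choices for illustration only.

```python
import math
import random
import statistics


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans_cost(points, k, seed, iters=20):
    """Stand-in clustering routine (Lloyd's k-means), NOT the paper's SEC.

    Returns the sum of squared Euclidean distances to the nearest center.
    """
    rng = random.Random(seed)          # fixed seed so a run can be repeated
    centers = rng.sample(points, k)    # k distinct data points as initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: euclidean(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the coordinate-wise mean of its cluster;
        # an empty cluster keeps its previous center.
        centers = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min(euclidean(p, c) for c in centers) ** 2 for p in points)


def average_over_runs(points, k, seeds):
    """Run the stand-in algorithm once per seed and average the costs."""
    costs = [kmeans_cost(points, k, s) for s in seeds]
    return statistics.mean(costs), costs


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Five runs, mirroring the "run five times and report the average" protocol.
mean_cost, costs = average_over_runs(points, k=2, seeds=range(5))
print(f"mean cost over {len(costs)} runs: {mean_cost:.4f}")
```

Recording the seeds alongside the averaged result is one way to supply the partitioning and setup details that the table marks as missing.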