Enhancing Ensemble Clustering with Adaptive High-Order Topological Weights

Authors: Jiaxuan Xu, Taiyong Li, Lei Duan

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed method.
Researcher Affiliation | Academia | 1 School of Computer Science, Sichuan University, Chengdu, China; 2 School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu, China
Pseudocode | Yes | Algorithm 1: ALM update Z and Algorithm 2: AWEC are provided.
Open Source Code | Yes | The source code of the proposed approach is available at https://github.com/ltyong/awec.
Open Datasets | Yes | We conduct extensive experiments on 14 real datasets from different domains. Characteristics of these datasets are provided in Table 1. We randomly run the k-means algorithm 100 times on each dataset (http://archive.ics.uci.edu/datasets) (Huang, Wang, and Lai 2017; Zhou, Zheng, and Pan 2019; Yu et al. 2022) to generate the base clustering result set.
Dataset Splits | No | The paper mentions running k-means on datasets to generate base clustering results and conducting repeated experiments, but it does not specify explicit train/validation/test dataset splits or their percentages/counts for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper mentions algorithms and methods used (e.g., k-means, ADMM, spectral clustering) but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, scikit-learn versions).
Experiment Setup | Yes | For the number of neighbors parameter in Eq. (4), we set it to 0.5s in all datasets, where s = n/c represents the average sample number in each category. AWEC has two main parameters: the noise regularization parameter λ and the parameter γ. We perform a 6*6 grid search for λ in the set {0.01, 0.02, 0.04, 0.08, 0.1, 0.2} and γ in the set {0.1, 0.5, 1, 5, 10, 50}. We set the ensemble size M = 20, conduct 10 repeated experiments with different base clustering combinations.
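
The experiment setup quoted above can be sketched in a few lines of Python. This is only an illustration of the reported settings, not the authors' code: the function names are hypothetical, rounding 0.5s to an integer is an assumption, and drawing each ensemble of M = 20 from the pool of 100 k-means results without replacement is an assumption about how the "different base clustering combinations" are formed.

```python
import itertools
import random

# Values quoted in the Experiment Setup and Open Datasets rows.
LAMBDAS = [0.01, 0.02, 0.04, 0.08, 0.1, 0.2]   # noise regularization parameter (lambda)
GAMMAS = [0.1, 0.5, 1, 5, 10, 50]              # parameter gamma
POOL_SIZE = 100       # k-means runs per dataset
ENSEMBLE_SIZE = 20    # M
N_REPEATS = 10        # repeated experiments


def neighbor_count(n_samples, n_clusters):
    """Neighbors for Eq. (4): 0.5 * s, with s = n / c.

    Rounding to an integer (and flooring at 1) is an assumption;
    the summary does not say how fractional values are handled.
    """
    s = n_samples / n_clusters
    return max(1, round(0.5 * s))


def parameter_grid():
    """All (lambda, gamma) pairs of the 6*6 grid search."""
    return list(itertools.product(LAMBDAS, GAMMAS))


def sample_ensembles(seed=0):
    """Indices of base clusterings for each repeated experiment.

    Assumption: each repeat draws M distinct results from the pool.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(POOL_SIZE), ENSEMBLE_SIZE))
            for _ in range(N_REPEATS)]
```

For example, a dataset with n = 150 samples and c = 3 categories gives s = 50 and therefore 25 neighbors, and the grid contains 36 (λ, γ) pairs to evaluate for each of the 10 sampled ensembles.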