Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Enhancing Ensemble Clustering with Adaptive High-Order Topological Weights
Authors: Jiaxuan Xu, Taiyong Li, Lei Duan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed method. |
| Researcher Affiliation | Academia | (1) School of Computer Science, Sichuan University, Chengdu, China; (2) School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu, China |
| Pseudocode | Yes | Algorithm 1: ALM update Z and Algorithm 2: AWEC are provided. |
| Open Source Code | Yes | The source code of the proposed approach is available at https://github.com/ltyong/awec. |
| Open Datasets | Yes | We conduct extensive experiments on 14 real datasets from different domains. Characteristics of these datasets are provided in Table 1. We randomly run the k-means algorithm 100 times on each dataset (http://archive.ics.uci.edu/datasets) (Huang, Wang, and Lai 2017; Zhou, Zheng, and Pan 2019; Yu et al. 2022) to generate the base clustering result set. |
| Dataset Splits | No | The paper mentions running k-means on datasets to generate base clustering results and conducting repeated experiments, but it does not specify explicit train/validation/test dataset splits or their percentages/counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms and methods used (e.g., k-means, ADMM, spectral clustering) but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, scikit-learn versions). |
| Experiment Setup | Yes | For the number of neighbors parameter in Eq. (4), we set it to 0.5s in all datasets, where s = n/c represents the average sample number in each category. AWEC has two main parameters: the noise regularization parameter λ and the parameter γ. We perform a 6*6 grid search for λ in the set {0.01, 0.02, 0.04, 0.08, 0.1, 0.2} and γ in the set {0.1, 0.5, 1, 5, 10, 50}. We set the ensemble size M = 20, conduct 10 repeated experiments with different base clustering combinations. |
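
The setup in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration of the quoted numbers, not the authors' code: the function names are hypothetical, rounding the neighbor count to an integer is an assumption, and drawing each ensemble of M = 20 without replacement from the pool of 100 k-means results is one plausible reading of "different base clustering combinations".

```python
import itertools
import random

# Values quoted from the Experiment Setup and Open Datasets rows.
LAMBDAS = [0.01, 0.02, 0.04, 0.08, 0.1, 0.2]   # noise regularization parameter λ
GAMMAS = [0.1, 0.5, 1, 5, 10, 50]              # parameter γ
POOL_SIZE = 100      # k-means runs per dataset
ENSEMBLE_SIZE = 20   # ensemble size M
N_REPEATS = 10       # repeated experiments


def neighbor_count(n_samples, n_clusters):
    """Number of neighbors for Eq. (4): 0.5 * s, where s = n / c.

    Rounding to an integer (and flooring at 1) is an assumption;
    the quoted text does not say how fractional values are handled.
    """
    s = n_samples / n_clusters
    return max(1, round(0.5 * s))


def parameter_grid():
    """All (λ, γ) pairs of the 6 x 6 grid search."""
    return list(itertools.product(LAMBDAS, GAMMAS))


def sample_ensembles(seed=0):
    """Pick N_REPEATS index sets of size M from the base clustering pool.

    Sampling without replacement within each repeat is an assumption.
    """
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(POOL_SIZE), ENSEMBLE_SIZE))
        for _ in range(N_REPEATS)
    ]
```

Each index set returned by `sample_ensembles` would select which of the 100 base clusterings enter one repeated run, and each run would then be evaluated over the 36 pairs from `parameter_grid`.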