Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
A Robust Convex Formulation for Ensemble Clustering
Authors: Junning Gao, Makoto Yamada, Samuel Kaski, Hiroshi Mamitsuka, Shanfeng Zhu
IJCAI 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first showed that using synthetic data experiments, RCEC could learn stable cluster assignments from the input matrix including anomalous clusters. We then showed that RCEC outperformed state-of-the-art ensemble clustering methods by using real-world data sets. |
| Researcher Affiliation | Academia | Junning Gao (1), Makoto Yamada (2), Samuel Kaski (2,3), Hiroshi Mamitsuka (2,3), Shanfeng Zhu (1). (1) School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China. (2) Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan. (3) Department of Computer Science, Aalto University, Finland. |
| Pseudocode | Yes | Algorithm 1 The RCEC algorithm |
| Open Source Code | No | No concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper was provided. |
| Open Datasets | Yes | Tr11 [Karypis, 2002], K1b [Karypis, 2002], ORL [Cai et al., 2006] |
| Dataset Splits | No | The paper describes varying input feature ratios and repetitions of experiments, but does not specify the train/validation/test dataset splits needed for reproduction. It mentions "randomly chose 60%, 70%, ..., 100% of the entire features for experiments" and "repeated the experiment 10 times by changing the random seed", but neither constitutes a dataset split. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments are provided. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) are provided. |
| Experiment Setup | Yes | We used λ = 0.1, γ = 0.01, and β = {0.01, 1, 2, 4, 6, ..., 20}. |