Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Aligning Contrastive Multiple Clusterings with User Interests

Authors: Shan Zhang, Liangrui Ren, Jun Wang, Yanyu Xu, Carlotta Domeniconi, Guoxian Yu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets show that CMClusts can generate interpretable and high-quality clusterings, which reflect different user interests. We conduct experiments using real-world benchmark datasets, and compare CMClusts against representative and competitive multiple clustering methods.
Researcher Affiliation | Academia | Shan Zhang¹, Liangrui Ren¹, Jun Wang¹, Yanyu Xu¹, Carlotta Domeniconi², Guoxian Yu¹; ¹School of Software, Shandong University, Jinan, China; ²Department of Computer Science, George Mason University, VA, USA
Pseudocode | Yes | Algorithm 1 lists the procedure of CMClusts.
Open Source Code | Yes | The source code of CMClusts is available at https://www.sduidea.cn/codes.php?name=CMClusts.
Open Datasets | Yes | Seven benchmark datasets (ALOI [Geusebroek et al., 2005], Fruit [Yao et al., 2023], CMUFace [Ren et al., 2023b], COIL [Nayar, 1996], Cards [Yao et al., 2023], WebKB and Mice [Ren et al., 2023b]) are used to evaluate the performance of CMClusts and other baselines. These datasets have been widely used to validate multiple clustering methods [Bailey, 2018; Yu et al., 2024].
Dataset Splits | No | No specific training/test/validation dataset splits are provided in the main text. The paper mentions: "More details are provided in the supplementary file." regarding datasets.
Hardware Specification | Yes | Additionally, all the methods are implemented in PyTorch 2.4 and tested on a server with NVIDIA L40 GPUs.
Software Dependencies | Yes | All the methods are implemented in PyTorch 2.4 and tested on a server with NVIDIA L40 GPUs.
Experiment Setup | No | No specific hyperparameter values or detailed training configurations are provided in the main text. The paper states: "And the α and β are hyperparameters. For a detailed analysis, please refer to the supplementary file."