Optimal Neighborhood Kernel Clustering with Multiple Kernels
Authors: Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, Jianping Yin
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have been conducted to evaluate the clustering performance of the proposed algorithm. As demonstrated, our algorithm significantly outperforms the state-of-the-art ones in the literature, verifying the effectiveness and advantages of ONKC. Comprehensive experimental study has been conducted on 16 multiple kernel learning (MKL) benchmark data sets to compare the clustering performance of the proposed algorithm with several state-of-the-art ones. |
| Researcher Affiliation | Academia | Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, Jianping Yin School of Computer, National University of Defense Technology, Changsha, China, 410073 |
| Pseudocode | Yes | Algorithm 1 Proposed Optimal Neighborhood Kernel Clustering with Multiple Kernels |
| Open Source Code | No | The Matlab implementations of KKM, MKKM and LMKKM are publicly available from the website4. For RMKKM, CRSC, RMSC, RMKC and MKKM-MR, we use their Matlab codes, which are freely downloadable from the authors' websites. This refers to the code of the compared algorithms, not the proposed method's own source code. |
| Open Datasets | Yes | We evaluate the clustering performance of the proposed algorithm on 16 benchmark data sets from various applications, including image recognition, gesture recognition and protein subcellular localization. The detailed information of these data sets is listed in Table 1. From this table, we observe that the number of samples, kernels and categories of these data sets shows considerable variation, which provides a good platform to compare the performance of different clustering algorithms. We then show how to construct base kernels for these data sets. For the first nine data sets, all kernel matrices are pre-computed and publicly available from websites1,2,3. |
| Dataset Splits | No | The paper states, 'For all algorithms, we repeat each experiment 50 times with random initialization to reduce the effect of randomness caused by k-means, and report the best result.' It does not specify train/validation/test splits, only that the number of clusters is set to the true number of classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Matlab implementation' for some compared algorithms but does not provide specific version numbers for Matlab or any other software dependencies used for the proposed method. |
| Experiment Setup | Yes | The parameters of RMKKM, RMSC and RMKC are selected by grid search according to the suggestions in their papers. For the proposed algorithm, its regularization parameters λ and ρ are both chosen from a large enough range [2^-15, 2^-13, ..., 2^15] by grid search. The clustering performance of all compared algorithms is evaluated in terms of three widely used criteria: clustering accuracy (ACC), normalized mutual information (NMI) and purity. For all algorithms, each experiment is repeated 50 times with random initialization to reduce the effect of randomness caused by k-means, and the best result is reported. |
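The reported protocol can be sketched as follows. This is a minimal illustration, not the authors' code: `run_onkc` and `accuracy` are hypothetical stand-ins for the ONKC solver and the ACC metric, which the paper does not release; only the grid range (2^-15 to 2^15 in powers-of-two steps) and the 50-restart best-result selection come from the paper.

```python
# Sketch of the paper's evaluation protocol (assumptions noted above):
# grid-search lambda and rho over {2^-15, 2^-13, ..., 2^15}, and for each
# setting repeat the k-means-based clustering 50 times with random
# initialization, reporting the best score found.
import itertools

PARAM_GRID = [2.0 ** p for p in range(-15, 16, 2)]  # 2^-15, 2^-13, ..., 2^15
N_REPEATS = 50  # random restarts per (lambda, rho) pair

def evaluate(run_onkc, accuracy):
    """Return the best accuracy over the (lambda, rho) grid and all restarts.

    run_onkc(lam, rho, seed) -> cluster labels   (hypothetical solver)
    accuracy(labels) -> float in [0, 1]          (hypothetical ACC metric)
    """
    best = 0.0
    for lam, rho in itertools.product(PARAM_GRID, PARAM_GRID):
        for seed in range(N_REPEATS):
            labels = run_onkc(lam, rho, seed)
            best = max(best, accuracy(labels))
    return best
```

Note that reporting the best of 50 random restarts (rather than the mean and standard deviation) is an optimistic evaluation choice, which is one reason the table marks dataset splits and variance reporting as absent.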