Optimal Neighborhood Kernel Clustering with Multiple Kernels
Authors: Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, Jianping Yin
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have been conducted to evaluate the clustering performance of the proposed algorithm. As demonstrated, our algorithm significantly outperforms the state-of-the-art ones in the literature, verifying the effectiveness and advantages of ONKC. Comprehensive experimental study has been conducted on 16 multiple kernel learning (MKL) benchmark data sets to compare the clustering performance of the proposed algorithm with several state-of-the-art ones. |
| Researcher Affiliation | Academia | Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, Jianping Yin School of Computer, National University of Defense Technology, Changsha, China, 410073 |
| Pseudocode | Yes | Algorithm 1 Proposed Optimal Neighborhood Kernel Clustering with Multiple Kernels |
| Open Source Code | No | The Matlab implementations of KKM, MKKM and LMKKM are publicly available from the website4. For RMKKM, CRSC, RMSC, RMKC and MKKM-MR, we use their Matlab codes, which are freely downloadable from the authors' websites. This refers to the code of the compared algorithms, not the proposed method's own source code. |
| Open Datasets | Yes | We evaluate the clustering performance of the proposed algorithm on 16 benchmark data sets from various applications, including image recognition, gesture recognition and protein subcellular localization. The detailed information of these data sets is listed in Table 1. From this table, we observe that the number of samples, kernels and categories of these data sets shows considerable variation, which provides a good platform to compare the performance of different clustering algorithms. We then show how to construct base kernels for these data sets. For the first nine data sets, all kernel matrices are pre-computed and publicly available from websites1,2,3. |
| Dataset Splits | No | The paper states, 'For all algorithms, we repeat each experiment 50 times with random initialization to reduce the effect of randomness caused by k-means, and report the best result.' It does not specify train/validation/test splits, only that the number of clusters is set to the true number of classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Matlab implementation' for some compared algorithms but does not provide specific version numbers for Matlab or any other software dependencies used for the proposed method. |
| Experiment Setup | Yes | The parameters of RMKKM, RMSC and RMKC are selected by grid search according to the suggestions in their papers. For the proposed algorithm, its regularization parameters λ and ρ are both chosen from a large enough range [2^-15, 2^-13, ..., 2^15] by grid search. The clustering performance of all compared algorithms is evaluated in terms of three widely used criteria: clustering accuracy (ACC), normalized mutual information (NMI) and purity. For all algorithms, each experiment is repeated 50 times with random initialization to reduce the effect of randomness caused by k-means, and the best result is reported. |
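The reported protocol can be sketched as follows. This is a minimal illustration, not the authors' code: `run_onkc` and `accuracy` are hypothetical stand-ins for the ONKC solver and the ACC metric, which the paper does not release; only the grid range (2^-15 to 2^15 in powers-of-two steps) and the 50-restart best-result selection come from the paper.

```python
# Sketch of the paper's evaluation protocol (assumptions noted above):
# grid-search lambda and rho over {2^-15, 2^-13, ..., 2^15}, and for each
# setting repeat the k-means-based clustering 50 times with random
# initialization, reporting the best score found.
import itertools

PARAM_GRID = [2.0 ** p for p in range(-15, 16, 2)]  # 2^-15, 2^-13, ..., 2^15
N_REPEATS = 50  # random restarts per (lambda, rho) pair

def evaluate(run_onkc, accuracy):
    """Return the best accuracy over the (lambda, rho) grid and all restarts.

    run_onkc(lam, rho, seed) -> cluster labels   (hypothetical solver)
    accuracy(labels) -> float in [0, 1]          (hypothetical ACC metric)
    """
    best = 0.0
    for lam, rho in itertools.product(PARAM_GRID, PARAM_GRID):
        for seed in range(N_REPEATS):
            labels = run_onkc(lam, rho, seed)
            best = max(best, accuracy(labels))
    return best
```

Note that reporting the best of 50 random restarts (rather than the mean and standard deviation) is an optimistic evaluation choice, which is one reason the table marks dataset splits and variance reporting as absent.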