Robust Multiple Kernel K-means Using L21-Norm
Authors: Liang Du, Peng Zhou, Lei Shi, Hanmo Wang, Mingyu Fan, Wenjian Wang, Yi-Dong Shen
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the proposed algorithms. Experimental results on benchmark data sets show that the proposed approaches achieve better clustering results in both the single kernel and multiple kernel learning settings. |
| Researcher Affiliation | Academia | (1) State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; (2) School of Computer and Information Technology, Shanxi University; (3) University of Chinese Academy of Sciences; (4) Institute of Intelligent System and Decision, Wenzhou University |
| Pseudocode | Yes | Algorithm 1 The algorithm of RMKKM (for orientation, a plain kernel k-means baseline, not RMKKM itself, is sketched after the table) |
| Open Source Code | Yes | For the purpose of reproducibility, we provide the code at https://github.com/csliangdu/RMKKM. |
| Open Datasets | Yes | We collect a variety of data sets, including 6 image data sets and 3 text corpora, most of which have been frequently used to evaluate the performance of different clustering algorithms. The statistics of these data sets are summarized in Table 1. (Table 1 lists: YALE, JAFFE, ORL, AR, COIL20, BA, TR11, TR41, TR45) |
| Dataset Splits | No | The paper states, “As suggested in [Yang et al., 2010], we independently repeat the experiments for 20 times with random initializations and report the best results corresponding to the best objective values.” This describes the experimental repetition and result selection but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or cross-validation details) for reproducing data partitioning. |
| Hardware Specification | No | The paper does not contain any information about specific hardware used for running its experiments (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific libraries or solvers with their versions). |
| Experiment Setup | Yes | For the proposed method RMKKM, the parameter γ controlling the kernel weight distribution is set to 0.3. In addition, the results of all the compared algorithms depend on the initialization. As suggested in [Yang et al., 2010], we independently repeat the experiments 20 times with random initializations and report the best results corresponding to the best objective values. Following the strategy of other multiple kernel learning approaches, we apply 12 different kernel functions as bases for multiple kernel clustering. These kernels include seven RBF kernels $K(x_i, x_j) = \exp(-\lVert x_i - x_j\rVert^2 / 2\delta^2)$ with $\delta = t \cdot D_0$, where $D_0$ is the maximum distance between samples and $t$ varies in the range $\{0.01, 0.05, 0.1, 1, 10, 50, 100\}$; four polynomial kernels $K(x_i, x_j) = (a + x_i^T x_j)^b$ with $a \in \{0, 1\}$ and $b \in \{2, 4\}$; and a cosine kernel $K(x_i, x_j) = (x_i^T x_j)/(\lVert x_i\rVert\,\lVert x_j\rVert)$. Finally, all the kernels are normalized through $K(x_i, x_j) = K(x_i, x_j)/\sqrt{K(x_i, x_i)\,K(x_j, x_j)}$ and then rescaled to $[0, 1]$. The number of clusters is set to the true number of classes for all the data sets and clustering algorithms. (A NumPy sketch of this kernel construction follows the table.) |
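
The Experiment Setup row above fully determines the 12 base kernels. Below is a minimal NumPy sketch of that construction; `build_kernels` is a hypothetical helper name (not from the authors' released code), the min-max rescaling is an assumption since the paper only says the kernels are "rescaled to [0, 1]" without giving the exact mapping, and no all-zero sample vectors are assumed so that the normalizations are well defined.

```python
import numpy as np

def build_kernels(X):
    """Build the 12 base kernels from the Experiment Setup row.

    X: (n, d) data matrix. Returns a list of (n, n) kernel matrices,
    each normalized by K_ij / sqrt(K_ii * K_jj) and rescaled to [0, 1].
    """
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    D0 = np.sqrt(D2.max())  # maximum distance between samples
    G = X @ X.T             # inner products x_i^T x_j

    kernels = []
    # Seven RBF kernels K = exp(-||x_i - x_j||^2 / (2 * delta^2)), delta = t * D0.
    for t in (0.01, 0.05, 0.1, 1, 10, 50, 100):
        kernels.append(np.exp(-D2 / (2 * (t * D0) ** 2)))
    # Four polynomial kernels K = (a + x_i^T x_j)^b.
    for a in (0, 1):
        for b in (2, 4):
            kernels.append((a + G) ** b)
    # One cosine kernel K = (x_i^T x_j) / (||x_i|| ||x_j||).
    norms = np.sqrt(np.diag(G))
    kernels.append(G / np.outer(norms, norms))

    out = []
    for K in kernels:
        d = np.sqrt(np.diag(K))
        K = K / np.outer(d, d)                   # K_ij / sqrt(K_ii K_jj)
        K = (K - K.min()) / (K.max() - K.min())  # assumed min-max rescale to [0, 1]
        out.append(K)
    return out
```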
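
The Pseudocode row points to Algorithm 1 (RMKKM), and the authors' implementation is available at the GitHub link above. For orientation only, the sketch below is plain kernel k-means on a single precomputed kernel matrix, wrapped in the 20-restart best-objective protocol quoted in the rows above. It is not RMKKM: the L21-norm sample reweighting and the γ-controlled kernel weight updates are omitted, and `kernel_kmeans` is a hypothetical name.

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_init=20, max_iter=100, seed=0):
    """Plain kernel k-means on a precomputed kernel matrix K (n x n).

    Runs n_init random initializations and keeps the labels with the
    lowest objective, mirroring the 20-restart protocol above.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    diag = np.diag(K)
    best_labels, best_obj = None, np.inf
    for _ in range(n_init):
        labels = rng.integers(n_clusters, size=n)
        for _ in range(max_iter):
            # ||phi(x_i) - c_k||^2 = K_ii - 2 * mean_{j in C_k} K_ij
            #                        + mean_{j,l in C_k} K_jl
            dist = np.tile(diag[:, None], (1, n_clusters))
            for k in range(n_clusters):
                idx = np.flatnonzero(labels == k)
                if idx.size == 0:
                    dist[:, k] = np.inf  # empty cluster: never assign to it
                    continue
                dist[:, k] += (-2 * K[:, idx].mean(axis=1)
                               + K[np.ix_(idx, idx)].mean())
            new_labels = dist.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        obj = dist[np.arange(n), labels].sum()  # objective at the last assignment
        if obj < best_obj:
            best_obj, best_labels = obj, labels
    return best_labels
```

Under these assumptions, an equal-weight combination of the 12 kernels, e.g. `kernel_kmeans(sum(build_kernels(X)) / 12, n_clusters)` with `n_clusters` set to the true number of classes, gives a simple multiple-kernel baseline against which RMKKM's learned kernel weights can be compared.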