Bilateral k-Means Algorithm for Fast Co-Clustering

Authors: Junwei Han, Kun Song, Feiping Nie, Xuelong Li

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various types of data sets are conducted. Compared with the state-of-the-art clustering methods, the proposed BKM not only has faster computational speed, but also achieves promising clustering results.
Researcher Affiliation | Academia | School of Automation, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, P. R. China; School of Computer Science and Center for OPTIMAL, Northwestern Polytechnical University, Xi'an, 710072, P. R. China; Center for OPTIMAL, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, P. R. China.
Pseudocode | Yes | Algorithm 1 Bilateral k-means algorithm
Open Source Code | No | The paper does not provide any explicit statement or link regarding the open-sourcing of the code for the described methodology.
Open Datasets | Yes | By following previous works, we adopt two types of data sets in our experiments, i.e., real-world data sets and synthetic data sets. The real-world data sets consist of WebKB4 (Ding et al. 2006), WebACE (Cai, Wu, and Han 2008), CSTR (Gu and Zhou 2009) and RCV1 (Lewis, Rose, and Li 2004), which are summarized in Table 1.
Dataset Splits | No | The paper does not explicitly provide the training/validation/test dataset splits needed to reproduce the experiment, such as percentages, sample counts, or explicit splitting methodologies.
Hardware Specification | Yes | All the experiments run on a computer with an Intel(R) Xeon(R) CPU E3-1225 V2 at 3.2 GHz and 16.0 GB of memory.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | In the experiments, the number of column clusters is set equal to that of row clusters for all the co-clustering methods. The two parameters of MDSLF are set equal empirically, and they are determined by the method in (Papalexakis 2013). In the order presented above, the parameters for the four real data sets are {80, 92, 43, 94}, respectively. For the synthetic data set, they are {37, 42, 51}.
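
The experiment setup can be recorded as a small configuration sketch for anyone attempting a replication. This is only an illustration, not code from the paper: the names `CoClusterSetup` and `make_setup` are hypothetical, the pairing of parameter values with data sets assumes the listing order WebKB4, WebACE, CSTR, RCV1, and "set equal" is read as both MDSLF parameters sharing one value. The number of clusters per data set is not given in this summary, so it is left as an input.

```python
from dataclasses import dataclass

# Assumption: the reported values {80, 92, 43, 94} are paired with the real
# data sets in the order they are listed (WebKB4, WebACE, CSTR, RCV1).
MDSLF_PARAM_REAL = {"WebKB4": 80, "WebACE": 92, "CSTR": 43, "RCV1": 94}

# The summary reports three values for synthetic data without naming the
# corresponding data sets, so they are kept as an unlabeled list.
MDSLF_PARAM_SYNTHETIC = [37, 42, 51]


@dataclass(frozen=True)
class CoClusterSetup:
    """Hypothetical container for one run's settings."""
    dataset: str
    n_row_clusters: int
    mdslf_param: int

    @property
    def n_col_clusters(self) -> int:
        # Column clusters are set equal to row clusters for all methods.
        return self.n_row_clusters

    @property
    def mdslf_params(self) -> tuple:
        # Assumption: the two MDSLF parameters share the same value.
        return (self.mdslf_param, self.mdslf_param)


def make_setup(dataset: str, n_row_clusters: int) -> CoClusterSetup:
    """Build the setup for a real data set; the cluster count is caller-supplied."""
    if dataset not in MDSLF_PARAM_REAL:
        raise KeyError(f"no reported MDSLF parameter for {dataset!r}")
    return CoClusterSetup(dataset, n_row_clusters, MDSLF_PARAM_REAL[dataset])


# Usage: the cluster count 4 is a placeholder, not a value from this summary.
setup = make_setup("CSTR", 4)
print(setup.n_col_clusters, setup.mdslf_params)
```

Keeping the cluster count as an argument avoids guessing values that would have to come from Table 1 of the paper.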