The Constrained Laplacian Rank Algorithm for Graph-Based Clustering

Authors: Feiping Nie, Xiaoqian Wang, Michael I. Jordan, Heng Huang

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on synthetic datasets and real-world benchmark datasets exhibit the effectiveness of this new graph-based clustering method.
Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, University of Texas at Arlington; (2) Departments of EECS and Statistics, University of California, Berkeley
Pseudocode | Yes | Algorithm 1: Algorithm to solve J_CLR-L2 in Eq. (1); Algorithm 2: Algorithm to solve J_CLR-L1 in Eq. (2). (A sketch of the alternating scheme follows the table.)
Open Source Code | No | The paper does not provide explicit statements or links to open-source code for the described methodology.
Open Datasets | Yes | Yeast (Asuncion and Newman 2007), Abalone (Asuncion and Newman 2007), COIL20 (Nene, Nayar, and Murase 1996b), COIL100 (Nene, Nayar, and Murase 1996a), AR (Martinez 1998), XM2VTS (XM2VTS), and UMIST (Graham and Allinson 1998)
Dataset Splits | No | The paper discusses synthetic and benchmark datasets and mentions using ground truth for clustering, but does not specify explicit train/validation/test splits or refer to a validation set.
Hardware Specification | No | The paper does not specify the hardware used for the experiments.
Software Dependencies | No | The paper does not list software dependencies or their version numbers.
Experiment Setup | Yes | For both self-tune Gaussian and our method, we set the number of neighbors m to five for the affinity matrix construction. For our clustering method, we determined the value of λ heuristically to accelerate the procedure: we first set λ to a small value; then, in each iteration, we computed the number of zero eigenvalues of L_S; if it was larger than k, we divided λ by two; if it was smaller, we multiplied λ by two; otherwise we stopped iterating. For all methods involving K-means, including the K-means, RCut, and NCut methods, we used the same initialization scheme, repeated 50 times, and kept the best initialization in terms of the K-means objective value. (The λ heuristic is sketched after the table.)
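
The report only names the paper's pseudocode (Algorithms 1 and 2) without reproducing it. As a rough illustration of the alternating scheme behind Algorithm 1, here is a minimal NumPy sketch, assuming the CLR_L2 relaxation min_S ||S - A||_F^2 + 2λ tr(F^T L_S F) over nonnegative row-stochastic S, and using tr(F^T L_S F) = (1/2) Σ_ij ||f_i - f_j||^2 s_ij to reduce each row update to a simplex projection. All function and variable names here are ours, not the paper's.

```python
import numpy as np

def simplex_projection(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def clr_l2_step(A, S, k, lam):
    """One alternating step: update the spectral embedding F for the
    current graph S, then update each row of S in closed form."""
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W                   # Laplacian L_S
    _, vecs = np.linalg.eigh(L)
    F = vecs[:, :k]                                  # k smallest eigenvectors
    sq = (F ** 2).sum(axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * F @ F.T    # d_ij = ||f_i - f_j||^2
    # Row-wise: min_{s_i in simplex} ||s_i - (a_i - (lam/2) d_i)||^2
    return np.stack([simplex_projection(A[i] - 0.5 * lam * D[i])
                     for i in range(A.shape[0])])
```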
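The λ heuristic quoted in the Experiment Setup row maps directly onto an outer loop around this step: halve λ when the Laplacian of the learned graph has more than k zero eigenvalues (too many connected components), double it when it has fewer, and stop at exactly k. A sketch under the same assumptions, where the tolerance for calling an eigenvalue "zero" is our choice:

```python
def count_zero_eigs(S, tol=1e-10):
    """Number of (near-)zero eigenvalues of the Laplacian of S, i.e.
    the number of connected components of the learned graph."""
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W
    return int(np.sum(np.linalg.eigvalsh(L) < tol))

def fit_clr(A, k, lam=1e-3, max_iter=100):
    """Outer loop implementing the quoted lambda heuristic."""
    S = A.copy()
    for _ in range(max_iter):
        S = clr_l2_step(A, S, k, lam)
        n_zero = count_zero_eigs(S)
        if n_zero > k:        # too many components: weaken the rank term
            lam /= 2.0
        elif n_zero < k:      # too few components: strengthen it
            lam *= 2.0
        else:                 # exactly k components: done
            break
    return S
```

For the K-means baselines, taking the best of 50 random initializations by objective value corresponds roughly to scikit-learn's KMeans(n_clusters=k, init='random', n_init=50), though the paper does not name the implementation it used.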