Deep Linear Coding for Fast Graph Clustering

Authors: Ming Shao, Sheng Li, Zhengming Ding, Yun Fu

IJCAI 2015

Reproducibility assessment: each variable is listed with its result and the supporting LLM response.

Research Type: Experimental
  "Extensive experiments on clustering tasks demonstrate that our method performs well in terms of both time complexity and clustering accuracy. On a large-scale benchmark dataset (580K), our method runs 1500 times faster than the original spectral clustering."

Researcher Affiliation: Academia
  Ming Shao, Sheng Li, Zhengming Ding (Department of ECE, Northeastern University, Boston, MA 02115, USA; {mingshao,shengli,allanding}@ece.neu.edu) and Yun Fu (Department of ECE, College of CIS, Northeastern University, Boston, MA 02115, USA; {yunfu}@ece.neu.edu).

Pseudocode: Yes
  Algorithm 1: Algorithm of Single-layer Linear Coding. Algorithm 2: Algorithm of Deep Linear Coding (DLC).

Open Source Code: No
  The paper does not provide concrete access to source code for the described methodology: no repository link, no explicit code-release statement, and no code in supplementary materials.

Open Datasets: Yes
  Corel: widely used in computer vision and image processing. Coil20: an object image database with 20 different objects. Yale B: a database popular in face-recognition evaluations. Pendigit: a handwritten-digit dataset. Letter: a dataset of the 26 capital letters. Mnist: another handwritten-digit benchmark widely used in clustering evaluations. Covtype: a large-scale scientific dataset.

Dataset Splits: No
  The paper evaluates on several datasets but does not specify train/validation/test splits or a cross-validation setup, limiting reproducibility.

Hardware Specification: No
  The paper provides no hardware details (CPU/GPU models, processor types, or memory amounts) for the machines used in its experiments.

Software Dependencies: No
  The paper mentions a Matlab implementation and the FLANN library, but provides no version numbers for these or any other software dependency, making the environment difficult to reproduce.

Experiment Setup: Yes
  "We set the number of neighbors in kNN search at 5, and the number of landmarks in the first and second layers at 1000 unless otherwise specified. In addition, we set both the balancing parameter λ and Gaussian kernel bandwidth σ at 1. To balance the performance and speed, the number of iterations in each layer is set to T = 5."
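The reported defaults can be collected into a small configuration sketch, shown below together with an illustrative Gaussian-kernel kNN-affinity construction at the stated settings (k = 5, σ = 1). This is a brute-force Python sketch, not the authors' implementation: their pipeline is a Matlab implementation using FLANN for the kNN search, and all function and variable names here are my own.

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup
# (names are illustrative, not from the paper).
CONFIG = {
    "knn_neighbors": 5,          # k in the kNN search
    "landmarks_per_layer": 1000, # landmarks in layers 1 and 2
    "lambda_balance": 1.0,       # balancing parameter lambda
    "sigma_kernel": 1.0,         # Gaussian kernel bandwidth sigma
    "iterations_per_layer": 5,   # T
}

def knn_affinity(X, k=5, sigma=1.0):
    """Symmetric kNN affinity matrix with Gaussian-kernel weights.

    Brute-force O(n^2) stand-in for an approximate search such as
    FLANN; only suitable for small data.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(d2, 0.0, out=d2)  # clamp tiny negatives from round-off
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors of point i, excluding i itself.
        order = np.argsort(d2[i])
        nbrs = [j for j in order if j != i][:k]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    # Symmetrize so the resulting graph is undirected.
    return np.maximum(W, W.T)

X = np.random.RandomState(0).randn(50, 4)
W = knn_affinity(X, k=CONFIG["knn_neighbors"], sigma=CONFIG["sigma_kernel"])
```

At the paper's scale (up to 580K points), the brute-force distance matrix above would be infeasible, which is presumably why the authors rely on an approximate nearest-neighbor library instead.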