Spectral Clustering via the Power Method - Provably

Authors: Christos Boutsidis, Prabhanjan Kambadur, Alex Gittens

ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental 5. Experiments To conduct our experiments, we developed high-quality MATLAB versions of the spectral clustering algorithms. In the remainder of this section, we refer to the clustering algorithm in (Shi & Malik, 2000a) as exact algorithm . We refer to the modified version that uses the power method as approximate algorithm . To measure clustering quality, we used normalized mutual information (Manning et al., 2008):
Researcher Affiliation Collaboration Christos Boutsidis BOUTSIDIS@YAHOO-INC.COM Yahoo, 229 West 43rd Street, New York, NY, USA. Alex Gittens GITTENS@ICSI.BERKELEY.EDU International Computer Science Institute, Berkeley, CA, USA. Prabhanjan Kambadur PKAMBADUR@BLOOMBERG.NET Bloomberg L.P., 731 Lexington Avenue, New York, 10022, USA.
Pseudocode No The paper describes the spectral clustering algorithm and the power method using numbered steps within the text, but these are not presented as formally structured pseudocode or algorithm blocks.
Open Source Code No The paper states 'To conduct our experiments, we developed high-quality MATLAB versions of the spectral clustering algorithms.' but does not provide concrete access to this code or state that it is open-source.
Open Datasets Yes We ran our experiments on four multi-class datasets from the lib SVMTools webpage (Table 1). ... The lib SVM multi-class classification datasets (Chang & Lin, 2011) used for our spectral clustering experiments. Software available at http://www.csie. ntu.edu.tw/~cjlin/libsvm.
Dataset Splits No The paper discusses the use of multi-class datasets for spectral clustering experiments but does not provide specific details on training, validation, or test dataset splits needed for reproducibility.
Hardware Specification Yes All our experiments were run using MATLAB 8.1.0.604 (R2013a) on a 1.4 GHz Intel Core i5 dual-core processor running OS X 10.9.5 with 8GB 1600 MHz DDR3 RAM.
Software Dependencies Yes All our experiments were run using MATLAB 8.1.0.604 (R2013a)... The exact algorithm uses MATLAB s svds function... The approximate algorithm exploits the tallthin structure of B and computes Y using MATLAB s svd function. The approximate algorithm uses MATLAB s normrnd function to generate the random Gaussian matrix S. We used MATLAB s kmeans function with the options Empty Action , singleton , Max Iter , 100, Replicates , 10.
Experiment Setup Yes To compute W, we use the heat kernel:Wij = e ( xi xj 2)/σij, where xi Rd and xj Rd are the data points and σij is a tuning parameter; σij is determined using the self-tuning method described in (Zelnik-Manor & Perona, 2004). That is, for each data point i, xi is computed to be the Euclidean distance of the ℓth furthest neighbor from i; then σij is set to be xixj for every (i, j); in our experiments, we report the results for ℓ= 7. ... We used MATLAB s kmeans function with the options Empty Action , singleton , Max Iter , 100, Replicates , 10. ... we varied p from 0 to 10.