Spectral Clustering via the Power Method - Provably
Authors: Christos Boutsidis, Prabhanjan Kambadur, Alex Gittens
ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments To conduct our experiments, we developed high-quality MATLAB versions of the spectral clustering algorithms. In the remainder of this section, we refer to the clustering algorithm in (Shi & Malik, 2000a) as exact algorithm . We refer to the modified version that uses the power method as approximate algorithm . To measure clustering quality, we used normalized mutual information (Manning et al., 2008): |
| Researcher Affiliation | Collaboration | Christos Boutsidis BOUTSIDIS@YAHOO-INC.COM Yahoo, 229 West 43rd Street, New York, NY, USA. Alex Gittens GITTENS@ICSI.BERKELEY.EDU International Computer Science Institute, Berkeley, CA, USA. Prabhanjan Kambadur PKAMBADUR@BLOOMBERG.NET Bloomberg L.P., 731 Lexington Avenue, New York, 10022, USA. |
| Pseudocode | No | The paper describes the spectral clustering algorithm and the power method using numbered steps within the text, but these are not presented as formally structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'To conduct our experiments, we developed high-quality MATLAB versions of the spectral clustering algorithms.' but does not provide concrete access to this code or state that it is open-source. |
| Open Datasets | Yes | We ran our experiments on four multi-class datasets from the lib SVMTools webpage (Table 1). ... The lib SVM multi-class classification datasets (Chang & Lin, 2011) used for our spectral clustering experiments. Software available at http://www.csie. ntu.edu.tw/~cjlin/libsvm. |
| Dataset Splits | No | The paper discusses the use of multi-class datasets for spectral clustering experiments but does not provide specific details on training, validation, or test dataset splits needed for reproducibility. |
| Hardware Specification | Yes | All our experiments were run using MATLAB 8.1.0.604 (R2013a) on a 1.4 GHz Intel Core i5 dual-core processor running OS X 10.9.5 with 8GB 1600 MHz DDR3 RAM. |
| Software Dependencies | Yes | All our experiments were run using MATLAB 8.1.0.604 (R2013a)... The exact algorithm uses MATLAB s svds function... The approximate algorithm exploits the tallthin structure of B and computes Y using MATLAB s svd function. The approximate algorithm uses MATLAB s normrnd function to generate the random Gaussian matrix S. We used MATLAB s kmeans function with the options Empty Action , singleton , Max Iter , 100, Replicates , 10. |
| Experiment Setup | Yes | To compute W, we use the heat kernel:Wij = e ( xi xj 2)/σij, where xi Rd and xj Rd are the data points and σij is a tuning parameter; σij is determined using the self-tuning method described in (Zelnik-Manor & Perona, 2004). That is, for each data point i, xi is computed to be the Euclidean distance of the ℓth furthest neighbor from i; then σij is set to be xixj for every (i, j); in our experiments, we report the results for ℓ= 7. ... We used MATLAB s kmeans function with the options Empty Action , singleton , Max Iter , 100, Replicates , 10. ... we varied p from 0 to 10. |