A Tighter Analysis of Spectral Clustering, and Beyond
Authors: Peter Macgregor, He Sun
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Besides its conceptual and theoretical significance, the practical impact of our work is demonstrated by the empirical analysis on both synthetic and real-world datasets, in which spectral clustering produces comparable or better results with fewer than k eigenvectors. |
| Researcher Affiliation | Academia | 1School of Informatics, University of Edinburgh, Edinburgh, United Kingdom. Correspondence to: Peter Macgregor <peter.macgregor@ed.ac.uk>, He Sun <h.sun@ed.ac.uk>. |
| Pseudocode | No | The paper describes the steps of the spectral clustering algorithm in Section 4 using numbered lists but does not present a formal pseudocode block or an algorithm box (an illustrative sketch of such a pipeline is given after the table). |
| Open Source Code | Yes | We detail the experiment setup in the appendix, and the code to reproduce our results is available at https://github.com/pmacg/spectral-clustering-meta-graphs. |
| Open Datasets | Yes | The significance of this result is further demonstrated by our extensive experimental analysis on the well-known BSDS, MNIST, and USPS datasets (Arbelaez et al., 2011; Hull, 1994; Le Cun et al., 1998). |
| Dataset Splits | No | The paper does not explicitly describe train/validation/test splits; it mentions only that the BSDS dataset 'consists of 500 images along with their ground-truth segmentations' and that the authors 'construct the k-nearest neighbour graph for k = 3' for MNIST/USPS. |
| Hardware Specification | Yes | Our experiments on synthetic data are performed on a desktop computer with an Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz processor and 16 GB RAM. The experiments on the BSDS, MNIST, and USPS datasets are performed on a compute server with 64 AMD EPYC 7302 16-Core Processors. |
| Software Dependencies | No | The paper states: 'We implement the spectral clustering algorithm in Python, using the scipy library for computing eigenvectors, and the k-means algorithm from the sklearn library.' While it names the libraries, specific version numbers are not provided. |
| Experiment Setup | No | The paper describes general aspects of its experiment setup, such as how similarity graphs are constructed (e.g., 'add an edge with weight exp(−‖u − v‖²/(2σ²)) where σ = 20'; a sketch of such a construction is given after the table), but it does not specify hyperparameters like learning rates, batch sizes, or optimizer settings typically found in 'experimental setup' sections for training machine learning models. |
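
As context for the Experiment Setup and Open Datasets rows above, the following is a minimal sketch of how a similarity graph with the quoted Gaussian edge weight exp(−‖u−v‖²/(2σ²)), σ = 20, and a k-nearest-neighbour construction (k = 3 is quoted for MNIST/USPS) might be built. The function name and the use of sklearn's `kneighbors_graph` are illustrative assumptions, not the authors' released code.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def gaussian_knn_graph(points, n_neighbors=3, sigma=20.0):
    """Build a sparse k-nearest-neighbour similarity graph with
    Gaussian edge weights exp(-||u - v||^2 / (2 * sigma^2)).

    Illustrative sketch only: the paper quotes this weight function and a
    3-NN construction, but the authors' exact pipeline may differ.
    """
    # Sparse matrix of Euclidean distances to each point's nearest neighbours.
    dist = kneighbors_graph(points, n_neighbors=n_neighbors, mode="distance")
    # Convert distances to Gaussian similarities in place.
    dist.data = np.exp(-dist.data ** 2 / (2 * sigma ** 2))
    # Symmetrise so the resulting graph is undirected.
    return dist.maximum(dist.T)
```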
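
Similarly, since the Pseudocode and Software Dependencies rows note that the paper describes the clustering steps in prose and names scipy and sklearn without versions, here is a hedged sketch of a spectral clustering pipeline that permits fewer than k eigenvectors, matching the quoted claim about the paper's empirical analysis. All names and defaults below are assumptions for illustration; the authors' actual implementation is in the linked repository.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def spectral_clustering(adjacency, num_clusters, num_eigenvectors=None):
    """Cluster a graph given by a sparse adjacency matrix.

    Sketch of standard spectral clustering; `num_eigenvectors` may be set
    smaller than `num_clusters`, reflecting the paper's observation that
    fewer than k eigenvectors can give comparable or better results.
    """
    if num_eigenvectors is None:
        num_eigenvectors = num_clusters
    # Normalised graph Laplacian L = I - D^{-1/2} A D^{-1/2}.
    laplacian = csgraph.laplacian(adjacency, normed=True)
    # Eigenvectors for the smallest eigenvalues of L (scipy, as the paper states).
    _, vectors = eigsh(laplacian, k=num_eigenvectors, which="SM")
    # k-means on the rows of the spectral embedding (sklearn, as the paper states).
    labels = KMeans(n_clusters=num_clusters, n_init=10).fit_predict(vectors)
    return labels
```

Combined with the graph sketch above, a hypothetical call such as `spectral_clustering(gaussian_knn_graph(X), num_clusters=10, num_eigenvectors=5)` would cluster a point set `X` using only five eigenvectors.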