Kernel Normalized Cut: a Theoretical Revisit
Authors: Yoshikazu Terada, Michio Yamamoto
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we study the theoretical properties of clustering based on the kernel normalized cut. Our first contribution is to derive a nonasymptotic upper bound on the expected distortion rate of the kernel normalized cut. From this result, we show that the solution of the kernel normalized cut converges to that of the population-level weighted k-means clustering on a certain reproducing kernel Hilbert space (RKHS). Our second contribution is the discovery of the interesting fact that the population-level weighted k-means clustering in the RKHS is equivalent to the population-level normalized cut. Combining these results, we can see that the kernel normalized cut converges to the population-level normalized cut. The criterion of the population-level normalized cut can be considered as an indivisibility of the population distribution, and this criterion plays an important role in the theoretical analysis of spectral clustering in Schiebinger et al. (2015). We believe that our results will provide deep insights into the behavior of both normalized cut and spectral clustering. ... In Section 5, we will look at the difference between spectral clustering and Ncut through numerical experiments. |
| Researcher Affiliation | Academia | (1) Graduate School of Engineering Science, Osaka University, Osaka, Japan; (2) RIKEN Center for Advanced Intelligence Project (AIP), Tokyo, Japan; (3) Graduate School of Environmental and Life Science, Okayama University, Okayama, Japan. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statements or links indicating that source code for the methodology is openly available. |
| Open Datasets | Yes | First, we consider the binary clustering problem for a two-moon data. ... From Li & Chen (2015) and the Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500) in Arbelaez et al. (2011), we selected the two natural images shown in Figure 3. |
| Dataset Splits | No | The paper mentions generating two-moon data and using BSDS500 for image segmentation, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits). |
| Hardware Specification | No | The paper does not specify any hardware details such as CPU, GPU, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Here, we use the Gaussian kernel k(x, y) = exp(−‖x − y‖²/(2σ²)) with σ² = 1/10 as the kernel function in both spectral clustering and Ncut. ... For several sample sizes n = 200, 400, 800, 1000, we apply both spectral clustering and Ncut with M = 2. ... For the Gaussian kernel k(x, y) = exp(−‖x − y‖²/(2σ²)), the tuning parameter σ is related to the strength of the nonlinearity. (A minimal code sketch of this setup follows the table.) |
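
As a rough illustration of the setup quoted in the Experiment Setup row, the following is a minimal sketch, not the authors' code: it generates two-moon data for the listed sample sizes and runs scikit-learn's spectral clustering with a Gaussian (RBF) affinity whose bandwidth matches σ² = 1/10. The noise level, the random seed, and the use of `sklearn.cluster.SpectralClustering` in place of the paper's kernel normalized-cut solver are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code): two-moon experiment as described in
# the Experiment Setup row, using scikit-learn's spectral clustering as a
# stand-in for the paper's methods. Noise level and seed are assumptions.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import SpectralClustering

sigma2 = 1.0 / 10.0           # Gaussian-kernel bandwidth sigma^2 from the paper
gamma = 1.0 / (2.0 * sigma2)  # sklearn's RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)

for n in (200, 400, 800, 1000):  # sample sizes listed in the paper
    X, _ = make_moons(n_samples=n, noise=0.05, random_state=0)  # noise level is an assumption
    labels = SpectralClustering(
        n_clusters=2,            # M = 2 clusters
        affinity="rbf",
        gamma=gamma,
        random_state=0,
    ).fit_predict(X)
    print(n, np.bincount(labels))
```

The only translation step is gamma = 1/(2σ²), which converts the paper's bandwidth parameterization of the Gaussian kernel into scikit-learn's RBF convention.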