Scalable Laplacian K-modes

Authors: Imtiaz Ziko, Eric Granger, Ismail Ben Ayed

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report comprehensive experiments over various data sets, which show that our algorithm yields very competitive performances in terms of optimization quality (i.e., the value of the discrete-variable objective at convergence) and clustering accuracy.
Researcher Affiliation | Academia | Imtiaz Masud Ziko (ÉTS Montreal), Eric Granger (ÉTS Montreal), Ismail Ben Ayed (ÉTS Montreal)
Pseudocode | Yes | Algorithm 1: SLK algorithm
Open Source Code | Yes | Code is available at: https://github.com/imtiazziko/SLK
Open Datasets | Yes | We used image datasets, except Shuttle and Reuters. The overall summary of the datasets is given in Table 1. For each dataset, imbalance is defined as the ratio of the size of the biggest cluster to the size of the smallest one. We use three versions of MNIST [17].
Dataset Splits | Yes | We choose the best initial seed and regularization parameter λ empirically based on the accuracy over a validation set (10% of the total data).
Hardware Specification | Yes | All the experiments (our methods and the baselines) were conducted on a machine with Xeon E5-2620 CPU and a Titan X Pascal GPU.
Software Dependencies | No | The paper mentions using libraries like Flann but does not provide specific version numbers for software dependencies for their implementation.
Experiment Setup | Yes | In all of the datasets, we fixed ρ = 5. For the large datasets such as MNIST, Shuttle and Reuters, we used the Flann library [19] with the KD-tree algorithm, which finds approximate nearest neighbors. Mode estimation is based on the Gaussian kernel $k(x, y) = e^{-\lVert x - y \rVert^2 / 2\sigma^2}$, with $\sigma^2$ estimated as $\sigma^2 = \frac{1}{N\rho} \sum_{x_p \in X} \sum_{x_q \in N_p^\rho} \lVert x_p - x_q \rVert^2$. Initial centers $\{m_l^0\}_{l=1}^{L}$ are based on K-means++ seeds [1]. We choose the best initial seed and regularization parameter λ empirically based on the accuracy over a validation set (10% of the total data). The λ is determined from tuning in a small range from 1 to 4.
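
The bandwidth estimate in the Experiment Setup row can be illustrated with a small sketch. It assumes that σ² is the mean squared distance from each point to its ρ nearest neighbors, and it uses an exact brute-force neighbor search rather than the approximate Flann KD-tree search the paper reports for large datasets. The function names (`squared_distance`, `estimate_sigma2`, `gaussian_kernel`) are hypothetical and are not taken from the SLK repository.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def estimate_sigma2(points, rho=5):
    """Mean squared distance from each point to its rho nearest neighbors.

    Assumption: exact brute-force search, O(N^2 log N); the paper uses
    approximate nearest neighbors (Flann) for large datasets instead.
    """
    n = len(points)
    if not 0 < rho < n:
        raise ValueError("rho must satisfy 0 < rho < number of points")
    total = 0.0
    for p, xp in enumerate(points):
        # Squared distances to every other point, smallest first.
        dists = sorted(
            squared_distance(xp, xq) for q, xq in enumerate(points) if q != p
        )
        total += sum(dists[:rho])
    return total / (n * rho)


def gaussian_kernel(x, y, sigma2):
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma2))."""
    return math.exp(-squared_distance(x, y) / (2.0 * sigma2))
```

For instance, calling `estimate_sigma2(points, rho=5)` on a list of feature tuples and passing the result as `sigma2` to `gaussian_kernel` mirrors the two steps described in the row above. For datasets of the size used in the paper, the quadratic brute-force loop would likely need to be replaced by an approximate nearest-neighbor index.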