Scalable Laplacian K-modes
Authors: Imtiaz Ziko, Eric Granger, Ismail Ben Ayed
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report comprehensive experiments over various data sets, which show that our algorithm yields very competitive performances in term of optimization quality (i.e., the value of the discrete-variable objective at convergence) and clustering accuracy. |
| Researcher Affiliation | Academia | Imtiaz Masud Ziko ÉTS Montreal Eric Granger ÉTS Montreal Ismail Ben Ayed ÉTS Montreal |
| Pseudocode | Yes | Algorithm 1: SLK algorithm |
| Open Source Code | Yes | Code is available at: https://github.com/imtiazziko/SLK |
| Open Datasets | Yes | We used image datasets, except Shuttle and Reuters. The overall summary of the datasets is given in Table 1. For each dataset, imbalance is defined as the ratio of the size of the biggest cluster to the size of the smallest one. We use three versions of MNIST [17]. |
| Dataset Splits | Yes | We choose the best initial seed and regularization parameter λ empirically based on the accuracy over a validation set (10% of the total data). |
| Hardware Specification | Yes | All the experiments (our methods and the baselines) were conducted on a machine with Xeon E5-2620 CPU and a Titan X Pascal GPU. |
| Software Dependencies | No | The paper mentions libraries such as Flann but does not provide version numbers for its software dependencies. |
| Experiment Setup | Yes | In all of the datasets, we fixed ρ = 5. For the large datasets such as MNIST, Shuttle and Reuters, we used the Flann library [19] with the KD-tree algorithm, which finds approximate nearest neighbors. Mode estimation is based on the Gaussian kernel k(x, y) = exp(−‖x − y‖²/(2σ²)), with σ² estimated as: σ² = (1/(Nρ)) Σ_{x_q ∈ N_ρ^p} ‖x_p − x_q‖². Initial centers {m_l^0}_{l=1}^L are based on K-means++ seeds [1]. We choose the best initial seed and regularization parameter λ empirically based on the accuracy over a validation set (10% of the total data). The λ is determined from tuning in a small range from 1 to 4. |
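The bandwidth estimate in the Experiment Setup row (mean squared distance to each point's ρ nearest neighbors, with ρ = 5) can be sketched as below. This is a minimal illustration, not the authors' released implementation: it uses SciPy's exact KD-tree in place of Flann's approximate nearest-neighbor search, and the function names are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def estimate_sigma2(X, rho=5):
    """Estimate the Gaussian kernel bandwidth sigma^2 as the mean
    squared distance from each point to its rho nearest neighbors
    (self excluded), as described in the experiment setup."""
    tree = cKDTree(X)  # exact KD-tree; the paper uses Flann's approximate search
    # Query rho + 1 neighbors: the closest one is the point itself (distance 0).
    dists, _ = tree.query(X, k=rho + 1)
    return float(np.mean(dists[:, 1:] ** 2))


def gaussian_kernel(x, y, sigma2):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return float(np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2)
                        / (2.0 * sigma2)))
```

With this bandwidth fixed once per dataset, all pairwise kernel evaluations during mode estimation reuse the same σ², so the estimate only needs one nearest-neighbor pass over the data.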