Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Convergence of Epanechnikov Mean Shift

Authors: Kejun Huang, Xiao Fu, Nicholas Sidiropoulos

AAAI 2018

Reproducibility Variable Result LLM Response
Research Type: Experimental, Illustrative Example. "Experiments show surprisingly good performance compared to Lloyd's K-means algorithm and the EM algorithm." "Specifically, we test the performance of the proposed deflation-based Epanechnikov Mean Shift and some classic clustering methods, including Lloyd's K-means algorithm, Expectation-Maximization (EM) for Gaussian mixture models (GMM), the two-round variant of EM by Dasgupta and Schulman (2000), and the original Epanechnikov Mean Shift."
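The Lloyd's K-means baseline quoted above is standard enough to sketch. The following is a generic NumPy illustration of Lloyd's algorithm, not the authors' MATLAB implementation:

```python
import numpy as np

def lloyd_kmeans(X, K, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid re-estimation for a fixed iteration budget."""
    rng = np.random.default_rng(seed)
    # initialize centroids at K distinct data points
    centroids = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # assignment step: nearest centroid in squared Euclidean distance
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: each centroid moves to the mean of its members
        for k in range(K):
            members = X[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids, labels
```

On well-separated data this converges to the obvious partition; the paper's point is that its deflation-based Epanechnikov Mean Shift compares favorably against this kind of baseline.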
Researcher Affiliation: Academia. Kejun Huang (University of Minnesota, Minneapolis, MN 55414), Xiao Fu (Oregon State University, Corvallis, OR 97331), Nicholas D. Sidiropoulos (University of Virginia, Charlottesville, VA 22904).
Pseudocode: Yes. Algorithm 1 (Epanechnikov Mean Shift), Algorithm 2 (Epanechnikov Mean Shift iterates, Redux), Algorithm 3 (Epanechnikov Mean Shift deflation).
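For context on what those algorithms compute: with the Epanechnikov kernel, whose profile has a constant gradient on its support, a mean-shift update simply moves the iterate to the mean of the data points within bandwidth h. A minimal, generic single-mode sketch (an illustration, not the paper's exact Algorithms 1-3):

```python
import numpy as np

def epanechnikov_mean_shift(X, x0, h, max_iter=200, tol=1e-9):
    """Iterate the Epanechnikov mean-shift map from x0: each step
    replaces x with the average of the data points lying within
    Euclidean distance h of it, until the iterate stops moving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        window = ((X - x) ** 2).sum(axis=1) <= h ** 2
        if not window.any():          # empty window: stay put
            break
        x_next = X[window].mean(axis=0)
        if np.linalg.norm(x_next - x) <= tol:
            break
        x = x_next
    return x
```

Because the window is a hard ball, only finitely many distinct point subsets can occur, which is what makes the convergence analysis of this kernel interesting in the paper.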
Open Source Code: No. The paper does not provide an explicit statement about the release of source code or a link to a code repository for the methodology described.
Open Datasets: No. The experiments are conducted on a synthetic dataset whose generation process is described, but no public access information (link, DOI, or formal citation) is provided for the dataset itself.
Dataset Splits: No. The paper describes generating synthetic data and repeating the simulation, and mentions leave-one-out cross-validation for tuning the kernel bandwidth, but it does not specify explicit training, validation, and test splits with percentages or sample counts for the main evaluation setup.
Hardware Specification: No. The paper does not provide specific hardware details (e.g., CPU/GPU models or memory specifications) used for running the experiments.
Software Dependencies: No. The paper mentions that "the experiment is conducted in MATLAB", but no version numbers for MATLAB or any other software dependencies are provided.
Experiment Setup: Yes. "For d = 100, we prescribe K = 30 clusters (Gaussian components). For cluster k, we first randomly generate its centroid μ_k ∼ N(0, 4I), and then generate M_k = 50k i.i.d. data points from N(μ_k, I)." The procedure is repeated 30 times, and the only parameter (the kernel bandwidth) is tuned by leave-one-out cross-validation.
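The quoted setup is concrete enough to sketch the data generation. Assuming "M_k = 50k" means 50·k points for cluster k, a NumPy version (an illustration, not the authors' MATLAB code) might look like:

```python
import numpy as np

def generate_clusters(d=100, K=30, seed=0):
    """Synthetic mixture per the quoted setup: centroid mu_k ~ N(0, 4I)
    (standard deviation 2 per coordinate), then M_k = 50*k i.i.d.
    points from N(mu_k, I) for clusters k = 1..K."""
    rng = np.random.default_rng(seed)
    points, labels = [], []
    for k in range(1, K + 1):
        mu_k = rng.normal(loc=0.0, scale=2.0, size=d)   # N(0, 4I)
        points.append(mu_k + rng.normal(size=(50 * k, d)))
        labels.append(np.full(50 * k, k))
    return np.vstack(points), np.concatenate(labels)
```

Under this reading, one draw yields 50·(1 + 2 + … + 30) = 23,250 points in 100 dimensions, with markedly unbalanced cluster sizes.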