On Convergence of Epanechnikov Mean Shift

Authors: Kejun Huang, Xiao Fu, Nicholas Sidiropoulos

AAAI 2018

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "Experiments show surprisingly good performance compared to the Lloyd's K-means algorithm and the EM algorithm." Illustrative example: "Specifically, we test the performance of the proposed deflation-based Epanechnikov Mean Shift and some classic clustering methods, including Lloyd's K-means algorithm, Expectation-Maximization (EM) for Gaussian mixture models (GMM), the two-round variant of EM by Dasgupta and Schulman (2000), and the original Epanechnikov Mean Shift."
Researcher Affiliation | Academia | Kejun Huang, University of Minnesota, Minneapolis, MN 55414, huang663@umn.edu; Xiao Fu, Oregon State University, Corvallis, OR 97331, xiao.fu@oregonstate.edu; Nicholas D. Sidiropoulos, University of Virginia, Charlottesville, VA 22904, nikos@virginia.edu
Pseudocode | Yes | Algorithm 1: Epanechnikov Mean Shift; Algorithm 2: Epanechnikov Mean Shift Iterates Redux; Algorithm 3: Epanechnikov Mean Shift Deflation
Open Source Code | No | The paper does not provide an explicit statement about the release of source code, or a link to a code repository, for the methodology described.
Open Datasets | No | The experiments are conducted on a synthetic dataset whose generation process is described, but no public access information (link, DOI, or formal citation) is provided for the dataset itself.
Dataset Splits | No | The paper describes generating synthetic data and repeating simulations, and mentions leave-one-out cross-validation for tuning a parameter, but does not specify explicit training/validation/test splits with percentages or sample counts for the main evaluation setup.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions that "the experiment is conducted in MATLAB", but no version numbers for MATLAB or any other software dependencies are provided.
Experiment Setup | Yes | For d = 100, we prescribe K = 30 clusters (Gaussian components). For cluster k, we first randomly generate its centroid μ_k ~ N(0, 4I), and then generate M_k = 50k i.i.d. data points from N(μ_k, I). The procedure is repeated 30 times. The only parameter (the kernel bandwidth) is tuned by leave-one-out cross-validation.
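The paper's pseudocode (Algorithms 1-3) is not reproduced here. As a rough illustration only, not the authors' implementation, the core property that a mean shift step with the Epanechnikov kernel reduces to averaging the data points within bandwidth h of the current iterate can be sketched as:

```python
import numpy as np

def epanechnikov_mean_shift(X, x0, h, max_iter=100, tol=1e-8):
    """One mode-seeking run of mean shift with the Epanechnikov kernel.

    With this kernel, the mean shift update is simply the mean of all
    data points lying within distance h of the current iterate.
    This is a generic textbook sketch, not the paper's Algorithm 1.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        mask = np.linalg.norm(X - x, axis=1) <= h
        if not mask.any():
            break  # no points in the window; iterate cannot move
        x_new = X[mask].mean(axis=0)
        if np.linalg.norm(x_new - x) < tol:
            return x_new  # converged to a (local) mode
        x = x_new
    return x
```

Function names and default parameters here are illustrative choices, not taken from the paper.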
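The synthetic setup above can be sketched as follows. This reads "M_k = 50k" as 50·k points for cluster k; that reading, along with the function name and seeding, is an assumption, not something stated by the source:

```python
import numpy as np

def generate_gmm_data(d=100, K=30, seed=0):
    """Synthetic data per the described setup: centroid mu_k ~ N(0, 4I)
    (i.e., standard deviation 2 per coordinate), and cluster k gets
    M_k = 50*k i.i.d. points from N(mu_k, I).

    Assumption: "50k" is interpreted as 50 times the cluster index k.
    """
    rng = np.random.default_rng(seed)
    X, labels = [], []
    for k in range(1, K + 1):
        mu = rng.normal(0.0, 2.0, size=d)          # variance 4 => std 2
        X.append(rng.normal(mu, 1.0, size=(50 * k, d)))
        labels.append(np.full(50 * k, k))
    return np.vstack(X), np.concatenate(labels)
```

The paper repeats this generation procedure 30 times; re-running with different seeds would mimic that.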