Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Convergence of Epanechnikov Mean Shift

Authors: Kejun Huang, Xiao Fu, Nicholas Sidiropoulos

AAAI 2018

Reproducibility Variable Result LLM Response
Research Type: Experimental, Illustrative Example. "Experiments show surprisingly good performance compared to Lloyd's K-means algorithm and the EM algorithm." "Specifically, we test the performance of the proposed deflation-based Epanechnikov Mean Shift and some classic clustering methods, including Lloyd's K-means algorithm, Expectation-Maximization (EM) for Gaussian mixture models (GMM), the two-round variant of EM by Dasgupta and Schulman (2000), and the original Epanechnikov Mean Shift."
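The Lloyd's K-means baseline quoted above is standard enough to sketch. The following is a generic NumPy illustration of Lloyd's algorithm, not the authors' MATLAB implementation:

```python
import numpy as np

def lloyd_kmeans(X, K, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid re-estimation for a fixed iteration budget."""
    rng = np.random.default_rng(seed)
    # initialize centroids at K distinct data points
    centroids = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # assignment step: nearest centroid in squared Euclidean distance
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: each centroid moves to the mean of its members
        for k in range(K):
            members = X[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids, labels
```

On well-separated data this converges to the obvious partition; the paper's point is that its deflation-based Epanechnikov Mean Shift compares favorably against this kind of baseline.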
Researcher Affiliation: Academia. Kejun Huang (University of Minnesota, Minneapolis, MN 55414), Xiao Fu (Oregon State University, Corvallis, OR 97331), Nicholas D. Sidiropoulos (University of Virginia, Charlottesville, VA 22904).
Pseudocode: Yes. Algorithm 1 (Epanechnikov Mean Shift), Algorithm 2 (Epanechnikov Mean Shift iterates, Redux), Algorithm 3 (Epanechnikov Mean Shift deflation).
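For context on what those algorithms compute: with the Epanechnikov kernel, whose profile has a constant gradient on its support, a mean-shift update simply moves the iterate to the mean of the data points within bandwidth h. A minimal, generic single-mode sketch (an illustration, not the paper's exact Algorithms 1-3):

```python
import numpy as np

def epanechnikov_mean_shift(X, x0, h, max_iter=200, tol=1e-9):
    """Iterate the Epanechnikov mean-shift map from x0: each step
    replaces x with the average of the data points lying within
    Euclidean distance h of it, until the iterate stops moving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        window = ((X - x) ** 2).sum(axis=1) <= h ** 2
        if not window.any():          # empty window: stay put
            break
        x_next = X[window].mean(axis=0)
        if np.linalg.norm(x_next - x) <= tol:
            break
        x = x_next
    return x
```

Because the window is a hard ball, only finitely many distinct point subsets can occur, which is what makes the convergence analysis of this kernel interesting in the paper.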
Open Source Code: No. The paper does not provide an explicit statement about the release of source code or a link to a code repository for the methodology described.
Open Datasets: No. The experiments are conducted on a synthetic dataset whose generation process is described, but no public access information (link, DOI, or formal citation) is provided for the dataset itself.
Dataset Splits: No. The paper describes generating synthetic data and repeating the simulation, and mentions leave-one-out cross-validation for tuning the kernel bandwidth, but it does not specify explicit training, validation, and test splits with percentages or sample counts for the main evaluation setup.
Hardware Specification: No. The paper does not provide specific hardware details (e.g., CPU/GPU models or memory specifications) used for running the experiments.
Software Dependencies: No. The paper mentions that "the experiment is conducted in MATLAB", but no version numbers for MATLAB or any other software dependencies are provided.
Experiment Setup: Yes. "For d = 100, we prescribe K = 30 clusters (Gaussian components). For cluster k, we first randomly generate its centroid μ_k ∼ N(0, 4I), and then generate M_k = 50k i.i.d. data points from N(μ_k, I)." The procedure is repeated 30 times, and the only parameter (the kernel bandwidth) is tuned by leave-one-out cross-validation.
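The quoted setup is concrete enough to sketch the data generation. Assuming "M_k = 50k" means 50·k points for cluster k, a NumPy version (an illustration, not the authors' MATLAB code) might look like:

```python
import numpy as np

def generate_clusters(d=100, K=30, seed=0):
    """Synthetic mixture per the quoted setup: centroid mu_k ~ N(0, 4I)
    (standard deviation 2 per coordinate), then M_k = 50*k i.i.d.
    points from N(mu_k, I) for clusters k = 1..K."""
    rng = np.random.default_rng(seed)
    points, labels = [], []
    for k in range(1, K + 1):
        mu_k = rng.normal(loc=0.0, scale=2.0, size=d)   # N(0, 4I)
        points.append(mu_k + rng.normal(size=(50 * k, d)))
        labels.append(np.full(50 * k, k))
    return np.vstack(points), np.concatenate(labels)
```

Under this reading, one draw yields 50·(1 + 2 + … + 30) = 23,250 points in 100 dimensions, with markedly unbalanced cluster sizes.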