Robust k-means: a Theoretical Revisit

Authors: Alexandros Georgogiannis

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this work, we present a theoretical analysis of the robustness and consistency properties of a variant of the classical quadratic k-means algorithm, the robust k-means... The synthetic data for the experiments come from a mixture of Gaussians... In Figures 2-3, we plot the results... The results for each scenario (accuracy, cluster estimation error, etc.) are averages over 150 runs of the experiment."
Researcher Affiliation | Academia | Alexandros Georgogiannis, School of Electrical and Computer Engineering, Technical University of Crete, Greece; alexandrosgeorgogiannis at gmail.com
Pseudocode | No | The paper describes algorithms and procedures in prose, but it does not include any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit statement of code release) for the source code of the methodology described.
Open Datasets | No | The paper uses synthetic data generated for the experiments ("The synthetic data for the experiments come from a mixture of Gaussians with 10 components..."), but it does not provide concrete access information (link, DOI, or formal citation) for a publicly available or open dataset.
Dataset Splits | No | The paper describes the synthetic data and experimental setup, but it does not specify explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory, or processor types) used for running its experiments.
Software Dependencies | No | The paper mentions the R package trimcluster [10] and the R package MixSim [14], but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | The parameter a in trimmed k-means (the percentage of outliers) is set to a = 0.3, while the value of the parameter λ for which (RKM) yields 150 outliers is found through a grid search over λ ∈ (0, λmax) (λmax is set to the maximum distance between two points in the dataset)... In Figures 2-3, we plot the results for a proximal map Pf like the one in (16) with h(x) = αx and α = 0.005.
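The synthetic-data setup quoted above (a mixture of Gaussians with 10 components, with results averaged over 150 runs per scenario) can be sketched as follows. The component count follows the paper's description; the dimension, center spread, unit covariances, and uniform mixing weights are illustrative assumptions, since the excerpt does not specify them:

```python
import numpy as np

def sample_gaussian_mixture(n=1500, k=10, d=2, spread=10.0, seed=0):
    """Draw n points from a k-component spherical Gaussian mixture.

    k = 10 matches the paper's description; d, spread, identity
    covariances, and uniform mixing weights are assumptions.
    """
    rng = np.random.default_rng(seed)
    means = rng.uniform(-spread, spread, size=(k, d))   # random component centers
    labels = rng.integers(0, k, size=n)                 # uniform component assignment
    points = means[labels] + rng.standard_normal((n, d))
    return points, labels

X, y = sample_gaussian_mixture()
print(X.shape)  # (1500, 2)
```

In practice the paper generates such data with the R package MixSim, which additionally controls the pairwise overlap between components; this sketch omits that control.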
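The λ grid search described in the Experiment Setup row can be sketched as below. This is a simplified stand-in, not the paper's exact proximal map Pf from its equation (16): here a point is flagged as an outlier when its distance to the nearest center exceeds λ, which is the behavior of a hard-thresholding proximal step. The helpers `count_rkm_outliers` and `find_lambda` are hypothetical names introduced for illustration:

```python
import numpy as np

def count_rkm_outliers(X, centers, lam):
    """Count points flagged as outliers: a point's outlier vector o_i is
    taken to be nonzero when its distance to the nearest center exceeds
    lam (a simplified thresholding rule, not the paper's exact map)."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return int(np.sum(dists.min(axis=1) > lam))

def find_lambda(X, centers, target=150, grid_size=200):
    """Grid search over lam in (0, lam_max], with lam_max the maximum
    pairwise distance in the dataset, for the lam whose outlier count
    is closest to `target` (150 in the paper's setup)."""
    diffs = X[:, None, :] - X[None, :, :]
    lam_max = np.linalg.norm(diffs, axis=2).max()
    grid = np.linspace(lam_max / grid_size, lam_max, grid_size)
    counts = [count_rkm_outliers(X, centers, lam) for lam in grid]
    best = int(np.argmin([abs(c - target) for c in counts]))
    return grid[best], counts[best]
```

Since the outlier count is monotonically non-increasing in λ, a bisection over (0, λmax) would also work; a plain grid keeps the sketch closest to the search the row describes.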