Fast k-means with accurate bounds

Authors: James Newling, François Fleuret

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare 23 k-means implementations, including our own implementations of all algorithms described, original implementations accompanying the papers (Hamerly, 2010; Drake, 2013; Ding et al., 2015), and implementations in two popular machine learning libraries, VLFeat and mlpack. We use the following notation to refer to implementations: {codesource-algorithm}, where codesource is one of bay (Hamerly, 2015), mlp (Curtin et al., 2013), pow (Low et al., 2010), vlf (Vedaldi & Fulkerson, 2008) and own (our own code), and algorithm is one of the algorithms described.
Researcher Affiliation | Academia | James Newling (JAMES.NEWLING@IDIAP.CH), Idiap Research Institute & EPFL, Switzerland; François Fleuret (FRANCOIS.FLEURET@IDIAP.CH), Idiap Research Institute & EPFL, Switzerland
Pseudocode | No | The paper describes algorithms conceptually and mathematically but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Fully parallelised implementations of all algorithms are provided under an open-source license at https://github.com/idiap/eakmeans
Open Datasets | Yes | Table 1. The 22 datasets used in experiments, ranging in dimension from 2 to 784. The datasets come from: the UCI, KDD and KEEL repositories (11,2,2), MNIST and STL-10 image databases (2,1), random (2), European Bioinformatics Institute (1) and Joensuu University (1). Full names and further details in D.
Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or methodology for partitioning the data).
Hardware Specification | Yes | All experiments are performed using double precision floating point numbers. We compare 23 k-means implementations... on a machine with an Intel i7 processor and 8MB of cache memory.
Software Dependencies | No | The paper mentions 'C++11 thread support library' and external libraries like VLFeat, mlpack, and OpenBLAS, but it does not provide specific version numbers for these or other key software components used in their own implementation.
Experiment Setup | No | The paper describes algorithmic details but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, epochs), optimizer settings, or detailed training configurations.
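
The {codesource-algorithm} notation quoted in the Research Type row can be read mechanically. The following is a minimal sketch, not code from the paper or its repository. It assumes labels are written without the surrounding braces and that the first hyphen separates the two parts. The function name parse_label and the algorithm name "ham" are hypothetical placeholders, since this summary does not list the algorithm abbreviations.

```python
# Codesource abbreviations as listed in the Research Type row.
CODESOURCES = {
    "bay": "Hamerly, 2015",
    "mlp": "Curtin et al., 2013",
    "pow": "Low et al., 2010",
    "vlf": "Vedaldi & Fulkerson, 2008",
    "own": "the authors' own code",
}


def parse_label(label):
    """Split a 'codesource-algorithm' label into (codesource, algorithm).

    Hypothetical helper: assumes no surrounding braces and that the
    first hyphen is the separator.
    """
    # Split on the first hyphen only, so an algorithm name that itself
    # contains a hyphen stays in one piece.
    codesource, sep, algorithm = label.partition("-")
    if not sep or not algorithm:
        raise ValueError(f"expected 'codesource-algorithm', got {label!r}")
    if codesource not in CODESOURCES:
        raise ValueError(f"unknown codesource {codesource!r}")
    return codesource, algorithm


if __name__ == "__main__":
    # "ham" is a placeholder algorithm name used only for illustration.
    source, algorithm = parse_label("own-ham")
    print(source, algorithm, CODESOURCES[source])
```

Rejecting unknown codesource prefixes keeps a mistyped label from being silently counted as a separate implementation.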