Fast k-means with accurate bounds
Authors: James Newling, François Fleuret
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare 23 k-means implementations, including our own implementations of all algorithms described, original implementations accompanying the papers (Hamerly, 2010; Drake, 2013; Ding et al., 2015), and implementations in two popular machine learning libraries, VLFeat and mlpack. We use the following notation to refer to implementations: {codesource-algorithm}, where codesource is one of bay (Hamerly, 2015), mlp (Curtin et al., 2013), pow (Low et al., 2010), vlf (Vedaldi & Fulkerson, 2008) and own (our own code), and algorithm is one of the algorithms described. |
| Researcher Affiliation | Academia | James Newling (JAMES.NEWLING@IDIAP.CH), Idiap Research Institute & EPFL, Switzerland; François Fleuret (FRANCOIS.FLEURET@IDIAP.CH), Idiap Research Institute & EPFL, Switzerland |
| Pseudocode | No | The paper describes algorithms conceptually and mathematically but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Fully parallelised implementations of all algorithms are provided under an open-source license at https://github.com/idiap/eakmeans |
| Open Datasets | Yes | Table 1. The 22 datasets used in experiments, ranging in dimension from 2 to 784. The datasets come from: the UCI, KDD and KEEL repositories (11,2,2), MNIST and STL-10 image databases (2,1), random (2), European Bioinformatics Institute (1) and Joensuu University (1). Full names and further details in D. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or methodology for partitioning the data). |
| Hardware Specification | Yes | All experiments are performed using double precision floating point numbers. We compare 23 k-means implementations... on a machine with an Intel i7 processor and 8MB of cache memory. |
| Software Dependencies | No | The paper mentions the C++11 thread support library and external libraries such as VLFeat, mlpack, and OpenBLAS, but it gives no version numbers for these or for other key software components used in the authors' own implementation. |
| Experiment Setup | No | The paper describes algorithmic details but does not provide specific experimental setup details such as hyperparameter values or detailed run configurations. |