A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++
Authors: Dennis Wei
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper studies the k-means++ algorithm for clustering as well as the class of Dℓ sampling algorithms to which k-means++ belongs. It is shown that for any constant factor β > 1, selecting βk cluster centers by Dℓsampling yields a constant-factor approximation to the optimal clustering with k centers, in expectation and without conditions on the dataset. This result extends the previously known O(log k) guarantee for the case β = 1 to the constant-factor bi-criteria regime. It also improves upon an existing constant-factor bi-criteria result that holds only with constant probability. |
| Researcher Affiliation | Industry | Dennis Wei IBM Research Yorktown Heights, NY 10598, USA dwei@us.ibm.com |
| Pseudocode | Yes | Algorithm 1 DℓSampling Input: Data points x1, . . . , xn, number of clusters t. Initialize φ(xi) = 1 for i = 1, . . . , n. for j = 1 to t do Select jth center cj = xi with probability φ(xi)/φ. Update φ(xi) for i = 1, . . . , n. |
| Open Source Code | No | The paper is theoretical and does not mention providing open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not mention the use of any specific publicly available datasets for training or provide access information. |
| Dataset Splits | No | The paper is theoretical and does not describe any specific training/test/validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details, such as hyperparameters or system-level training settings. |