Near-Optimal Algorithms for Explainable k-Medians and k-Means
Authors: Konstantin Makarychev, Liren Shan
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a new algorithm for this problem which is O(log k) competitive with k-medians with ℓ1 norm and O(k) competitive with k-means. This is an improvement over the previous guarantees of O(k) and O(k2) by Dasgupta et al (2020). We also provide a new algorithm which is O(log 3/2 k) competitive for k-medians with ℓ2 norm. Our first algorithm is near-optimal: Dasgupta et al (2020) showed a lower bound of Ω(log k) for k-medians; in this work, we prove a lower bound of Ω(k) for k-means. We also provide a lower bound of Ω(log k) for k-medians with ℓ2 norm. |
| Researcher Affiliation | Academia | Konstantin Makarychev * 1 Liren Shan * 1 1Northwestern University, Evanston, IL, USA. |
| Pseudocode | Yes | Algorithm 1 Threshold tree construction for k-medians in ℓ1 |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not involve the use of datasets for training. Therefore, it does not mention public dataset access information. |
| Dataset Splits | No | The paper is theoretical and focuses on algorithm design and proofs, not empirical evaluation requiring dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and presents algorithms and their competitive analyses. It does not describe any experiments that would require specific hardware, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical, presenting algorithms and mathematical proofs. It does not mention any specific software dependencies with version numbers required for replication. |
| Experiment Setup | No | The paper is theoretical, focusing on algorithm design and analysis. It does not describe any empirical experiments that would require details on hyperparameters, training configurations, or system-level settings. |