Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Near-Optimal Algorithms for Explainable k-Medians and k-Means

Authors: Konstantin Makarychev, Liren Shan

ICML 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We propose a new algorithm for this problem which is O(log k) competitive with k-medians with β„“1 norm and O(k) competitive with k-means. This is an improvement over the previous guarantees of O(k) and O(k2) by Dasgupta et al (2020). We also provide a new algorithm which is O(log 3/2 k) competitive for k-medians with β„“2 norm. Our first algorithm is near-optimal: Dasgupta et al (2020) showed a lower bound of Ω(log k) for k-medians; in this work, we prove a lower bound of Ω(k) for k-means. We also provide a lower bound of Ω(log k) for k-medians with β„“2 norm.
Researcher Affiliation Academia Konstantin Makarychev * 1 Liren Shan * 1 1Northwestern University, Evanston, IL, USA.
Pseudocode Yes Algorithm 1 Threshold tree construction for k-medians in β„“1
Open Source Code No The paper does not contain any explicit statement about releasing source code or provide a link to a code repository.
Open Datasets No The paper is theoretical and does not involve the use of datasets for training. Therefore, it does not mention public dataset access information.
Dataset Splits No The paper is theoretical and focuses on algorithm design and proofs, not empirical evaluation requiring dataset splits for training, validation, or testing.
Hardware Specification No The paper is theoretical and presents algorithms and their competitive analyses. It does not describe any experiments that would require specific hardware, therefore no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical, presenting algorithms and mathematical proofs. It does not mention any specific software dependencies with version numbers required for replication.
Experiment Setup No The paper is theoretical, focusing on algorithm design and analysis. It does not describe any empirical experiments that would require details on hyperparameters, training configurations, or system-level settings.