Approximation Scheme for Weighted Metric Clustering via Sherali-Adams
Authors: Dmitrii Avdiukhin, Vaggos Chatziafratis, Konstantin Makarychev, Grigory Yaroslavtsev
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we introduce a practical version of our algorithm based on LPII, which provides a constant-factor approximation to objective ( ). We run our experiments on 104 data points and show that our hierarchical clustering objective recovers a groundtruth clustering better compared to the popular Dasgupta s objective (Dasgupta 2016). Datasets We perform evaluation on various hierarchical datasets. In this section, we present experiments on random subsamples (of sizes 102, 103, and 104) of a well-known 20 NEWSGROUPS dataset (Lang 1995) |
| Researcher Affiliation | Academia | 1Northwestern University, Illinois 2University of California at Santa Cruz, California 3George Mason University, Virginia |
| Pseudocode | Yes | Algorithm 1: PTAS for Weighted Metric Clustering. Algorithm 2: MAKEINDEPENDENT({P}, {Di}, δ, γ) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology or links to code repositories. |
| Open Datasets | Yes | We present experiments on random subsamples (of sizes 102, 103, and 104) of a well-known 20 NEWSGROUPS dataset (Lang 1995), and in Appendix G we present additional experiments on ZEBRAFISH (Wagner et al. 2018), CIFAR-10 (Krizhevsky and Hinton 2009), and other datasets. |
| Dataset Splits | No | The paper mentions using random subsamples of datasets and evaluating against ground-truth clustering. However, it does not specify any training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits) for reproducibility. |
| Hardware Specification | Yes | The experiments are performed on a single-core Intel Xeon 2.2GHz CPU. |
| Software Dependencies | No | The paper mentions using 'METIS (Karypis and Kumar 1995)' but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | For our experiments, we use a simplified version of the algorithm, based on the LPII relaxation from Section 4, which achieves 3-approximation (Appendix B): u V nj Aij duci j Pu u Ci , where n1, . . . , nk are cluster cardinalities and c1, . . . , ck are cluster centers. This objective can be optimized efficiently as an instance of a minimum-cost flow problem, while precisely satisfying the imposed cardinality constraints. We run the algorithm multiple times with different guesses of {ni} and {ci}, and, since the guesses might be not precise, we improve the resulting solution using local search. |