Approximation Scheme for Weighted Metric Clustering via Sherali-Adams

Authors: Dmitrii Avdiukhin, Vaggos Chatziafratis, Konstantin Makarychev, Grigory Yaroslavtsev

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In Section 6, we introduce a practical version of our algorithm based on LPII, which provides a constant-factor approximation to objective ( ). We run our experiments on 104 data points and show that our hierarchical clustering objective recovers a groundtruth clustering better compared to the popular Dasgupta s objective (Dasgupta 2016). Datasets We perform evaluation on various hierarchical datasets. In this section, we present experiments on random subsamples (of sizes 102, 103, and 104) of a well-known 20 NEWSGROUPS dataset (Lang 1995)
Researcher Affiliation Academia 1Northwestern University, Illinois 2University of California at Santa Cruz, California 3George Mason University, Virginia
Pseudocode Yes Algorithm 1: PTAS for Weighted Metric Clustering. Algorithm 2: MAKEINDEPENDENT({P}, {Di}, δ, γ)
Open Source Code No The paper does not provide any explicit statements about releasing source code for the described methodology or links to code repositories.
Open Datasets Yes We present experiments on random subsamples (of sizes 102, 103, and 104) of a well-known 20 NEWSGROUPS dataset (Lang 1995), and in Appendix G we present additional experiments on ZEBRAFISH (Wagner et al. 2018), CIFAR-10 (Krizhevsky and Hinton 2009), and other datasets.
Dataset Splits No The paper mentions using random subsamples of datasets and evaluating against ground-truth clustering. However, it does not specify any training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits) for reproducibility.
Hardware Specification Yes The experiments are performed on a single-core Intel Xeon 2.2GHz CPU.
Software Dependencies No The paper mentions using 'METIS (Karypis and Kumar 1995)' but does not provide a specific version number for this or any other software dependency.
Experiment Setup Yes For our experiments, we use a simplified version of the algorithm, based on the LPII relaxation from Section 4, which achieves 3-approximation (Appendix B): u V nj Aij duci j Pu u Ci , where n1, . . . , nk are cluster cardinalities and c1, . . . , ck are cluster centers. This objective can be optimized efficiently as an instance of a minimum-cost flow problem, while precisely satisfying the imposed cardinality constraints. We run the algorithm multiple times with different guesses of {ni} and {ci}, and, since the guesses might be not precise, we improve the resulting solution using local search.