Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Clustering What Matters: Optimal Approximation for Clustering with Outliers

Authors: Akanksha Agrawal, Tanmay Inamdar, Saket Saurabh, Jie Xue

JAIR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in k and m, i.e., an algorithm with running time of the form f(k, m) · n^O(1) for some function f, that almost matches the approximation ratio for its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for k-Median and k-Means with outliers in general and Euclidean metrics.
Researcher Affiliation | Academia | Akanksha Agrawal (EMAIL), Indian Institute of Technology, Chennai 600036, India; Tanmay Inamdar (EMAIL), University of Bergen, Bergen 5007, Norway; Saket Saurabh (EMAIL), Institute of Mathematical Sciences, 4th Cross St, Chennai 600113, India, and University of Bergen, Bergen 5007, Norway; Jie Xue (EMAIL), NYU Shanghai, 1555 Shiji Blvd, Pudong, Shanghai 200122, China
Pseudocode | No | Figure 1: A conceptual overview of our approximation-preserving reduction from k-Median-Out to k-Median. Here, A denotes a β-approximation algorithm for k-Median, which we use as a black box on each of the instances I1, ..., It of k-Median.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of source code for the methodology described.
Open Datasets | No | The paper describes theoretical algorithms and their approximation ratios for clustering problems. It does not present any experimental evaluation using specific datasets or provide access information for open datasets.
Dataset Splits | No | The paper does not present experimental results or use specific datasets; therefore, there is no mention of training/test/validation splits.
Hardware Specification | No | The paper focuses on theoretical algorithm design and analysis, and thus does not describe the hardware used for experimental evaluation.
Software Dependencies | No | The paper focuses on theoretical algorithm design and analysis, and thus does not mention any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical in nature and does not describe an experimental setup, hyperparameters, or training configurations.
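The Pseudocode row above refers to Figure 1's approximation-preserving reduction, which runs a black-box β-approximation algorithm A on each of the reduced instances I1, ..., It and keeps the cheapest solution found. The paper's actual construction of those instances is not reproduced here; the following is only a minimal sketch of the black-box pattern, with all names (solve_with_outliers, approx_algo, cost) being illustrative, not from the paper:

```python
def solve_with_outliers(instances, approx_algo, cost):
    """Run a black-box approximation algorithm on each reduced
    instance and return the minimum-cost solution found.

    instances  -- the reduced k-Median instances I1, ..., It
                  (their construction is the paper's contribution
                  and is not sketched here)
    approx_algo -- black-box beta-approximation for the outlier-free
                   problem; maps an instance to a solution
    cost        -- evaluates a solution's objective value
    """
    best = None
    for instance in instances:
        solution = approx_algo(instance)
        if best is None or cost(solution) < cost(best):
            best = solution
    return best
```

A toy invocation, with a stand-in "algorithm" that just sums an instance's points, shows the selection behavior: `solve_with_outliers([[3], [1], [2]], sum, lambda s: s)` returns 1, the cheapest of the three black-box outputs.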