The Parameterized Complexity of Clustering Incomplete Data

Authors: Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider7296-7304

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We give tight characterizations of the parameterized complexity of these problems with respect to the parameters k, r, and the minimum number of rows and columns needed to cover all the missing entries. We show that the considered problems are fixed-parameter tractable when parameterized by the three parameters combined, and that dropping any of the three parameters results in parameterized intractability. A byproduct of our results is that, for the complete data setting, all problems under consideration are fixed-parameter tractable parameterized by k + r. Our main algorithmic contribution shows that the incomplete variants of all three clustering problems are fixed-parameter tractable parameterized by k + r + cover, and as a consequence the complete variants are fixed-parameter tractable parameterized by k + r. Notably, our tractability results are obtained using kernelization (Fomin et al. 2019; Gaspers and Szeider 2014) and therefore provide efficient polynomial-time preprocessing procedures, which can be applied before the application of any available (even heuristic) clustering algorithm.
Researcher Affiliation Academia Eduard Eiben,1 Robert Ganian,2 Iyad Kanj,3 Sebastian Ordyniak,4 Stefan Szeider2 1Department of Computer Science, Royal Holloway, University of London, Egham, UK 2Algorithms and Complexity Group, TU Wien, Vienna, Austria 3School of Computing, De Paul University, Chicago, USA 4University of Leeds, School of Computing, Leeds, UK
Pseudocode No The paper describes algorithmic techniques but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.
Open Source Code No The paper does not contain any statements about releasing open-source code, nor does it provide links to code repositories.
Open Datasets No This paper is theoretical and does not conduct experiments on datasets, therefore it does not discuss public dataset availability.
Dataset Splits No This paper is theoretical and does not involve experimental evaluation with data, thus it does not specify training, validation, or test dataset splits.
Hardware Specification No This paper is theoretical and does not describe experimental procedures that would require specific hardware, thus no hardware specifications are provided.
Software Dependencies No This paper is theoretical and does not describe experimental implementations requiring specific software dependencies with version numbers.
Experiment Setup No This paper focuses on theoretical complexity analysis and does not include details on experimental setups or hyperparameters.