The Parameterized Complexity of Clustering Incomplete Data
Authors: Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider7296-7304
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We give tight characterizations of the parameterized complexity of these problems with respect to the parameters k, r, and the minimum number of rows and columns needed to cover all the missing entries. We show that the considered problems are fixed-parameter tractable when parameterized by the three parameters combined, and that dropping any of the three parameters results in parameterized intractability. A byproduct of our results is that, for the complete data setting, all problems under consideration are fixed-parameter tractable parameterized by k + r. Our main algorithmic contribution shows that the incomplete variants of all three clustering problems are fixed-parameter tractable parameterized by k + r + cover, and as a consequence the complete variants are fixed-parameter tractable parameterized by k + r. Notably, our tractability results are obtained using kernelization (Fomin et al. 2019; Gaspers and Szeider 2014) and therefore provide efficient polynomial-time preprocessing procedures, which can be applied before the application of any available (even heuristic) clustering algorithm. |
| Researcher Affiliation | Academia | Eduard Eiben,1 Robert Ganian,2 Iyad Kanj,3 Sebastian Ordyniak,4 Stefan Szeider2 1Department of Computer Science, Royal Holloway, University of London, Egham, UK 2Algorithms and Complexity Group, TU Wien, Vienna, Austria 3School of Computing, De Paul University, Chicago, USA 4University of Leeds, School of Computing, Leeds, UK |
| Pseudocode | No | The paper describes algorithmic techniques but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | No | The paper does not contain any statements about releasing open-source code, nor does it provide links to code repositories. |
| Open Datasets | No | This paper is theoretical and does not conduct experiments on datasets, therefore it does not discuss public dataset availability. |
| Dataset Splits | No | This paper is theoretical and does not involve experimental evaluation with data, thus it does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | This paper is theoretical and does not describe experimental procedures that would require specific hardware, thus no hardware specifications are provided. |
| Software Dependencies | No | This paper is theoretical and does not describe experimental implementations requiring specific software dependencies with version numbers. |
| Experiment Setup | No | This paper focuses on theoretical complexity analysis and does not include details on experimental setups or hyperparameters. |