Sublinear Algorithms for Hierarchical Clustering

Authors: Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model. We believe this relaxed notion of cut sparsifiers may be of broader interest. We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing algorithms with better performance guarantees in each of these models.
Researcher Affiliation Academia 1Data Science Institute, Columbia University 2Department of Computer and Information Science, University of Pennsylvania {arpit.agarwal}@columbia.edu {sanjeev,huanli,pprath}@cis.upenn.edu
Pseudocode No The paper describes algorithms conceptually and presents theorems and proofs, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any information or links regarding open-source code for the methodology described.
Open Datasets No This is a theoretical paper focused on algorithm design, proofs, and complexity analysis (upper and lower bounds) in specific computational models. It does not involve training models on datasets.
Dataset Splits No This is a theoretical paper focused on algorithm design, proofs, and complexity analysis (upper and lower bounds) in specific computational models. It does not involve validation sets for empirical evaluation.
Hardware Specification No This is a theoretical paper. It focuses on algorithm design and analysis rather than empirical experiments that would require specific hardware specifications.
Software Dependencies No This is a theoretical paper. It focuses on algorithm design and analysis and does not specify any software dependencies with version numbers for experimental reproducibility.
Experiment Setup No This is a theoretical paper. It focuses on algorithm design and analysis and does not include details about an experimental setup, hyperparameters, or training configurations.