Clustering with Same-Cluster Queries
Authors: Hassan Ashtiani, Shrinu Kushagra, Shai Ben-David
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the query and computational complexity of clustering in this framework. We consider a setting where the expert conforms to a center-based clustering with a notion of margin. We show that there is a trade off between computational complexity and query complexity; We prove that for the case of k-means clustering (i.e., when the expert conforms to a solution of k-means), having access to relatively few such queries allows efficient solutions to otherwise NP hard problems. In particular, we provide a probabilistic polynomial-time (BPP) algorithm for clustering in this setting that asks O k2 log k + k log n) same-cluster queries and runs with time complexity O kn log n). |
| Researcher Affiliation | Academia | Hassan Ashtiani , Shrinu Kushagra and Shai Ben-David David R. Cheriton School of Computer Science University of Waterloo, Waterloo, Ontario, Canada {mhzokaei,skushagr,shai}@uwaterloo.ca |
| Pseudocode | Yes | Algorithm 1: Algorithm for γ(> 1)-margin instances with queries |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and complexity analysis; it does not describe experiments using publicly available datasets or provide access information for any dataset used for training. |
| Dataset Splits | No | The paper is theoretical and does not mention any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe the hardware used to run experiments. |
| Software Dependencies | No | The paper is theoretical and does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper describes an algorithm theoretically but does not provide specific experimental setup details like hyperparameters or training configurations for an empirical study. |