Near-Optimal Private and Scalable $k$-Clustering

Authors: Vincent Cohen-Addad, Alessandro Epasto, Vahab Mirrokni, Shyam Narayanan, Peilin Zhong

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the differentially private (DP) k-means and k-median clustering problems for n points in d-dimensional Euclidean space in the massively parallel computation (MPC) model. We provide two near-optimal algorithms, where near-optimality holds in three respects: both achieve (1) O(1) parallel computation rounds, (2) total computational work that is near-linear in n and polynomial in k (i.e., near-linear running time when n is a sufficiently large polynomial in k), and (3) O(1) relative approximation with poly(k, d) additive error.
Researcher Affiliation | Collaboration | Vincent Cohen-Addad (Google Research), Alessandro Epasto (Google Research), Vahab Mirrokni (Google Research), Shyam Narayanan (MIT), Peilin Zhong (Google Research). Emails: {cohenaddad,aepasto,mirrokni,peilinz}@google.com, shyamsn@mit.edu.
Pseudocode | Yes | We present the algorithm pseudocode below in Algorithm 1 and defer the full analysis to Appendix C.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described: there is no code repository link, explicit code release statement, or code in supplementary materials.
Open Datasets | No | The paper focuses on theoretical algorithm design and analysis. It does not conduct empirical studies with datasets and therefore provides no access information for a publicly available or open dataset.
Dataset Splits | No | The paper focuses on theoretical algorithm design and analysis. It does not conduct empirical studies and therefore provides no dataset split information (e.g., training/validation/test percentages or counts).
Hardware Specification | No | The paper focuses on theoretical algorithm design and analysis. It describes no empirical experiments or their setup and therefore provides no hardware details (e.g., GPU/CPU models or memory amounts) used to run experiments.
Software Dependencies | No | The paper focuses on theoretical algorithm design and analysis. It provides no ancillary software details, such as library or solver names with version numbers, needed to replicate experiments.
Experiment Setup | No | The paper focuses on theoretical algorithm design and analysis. It describes no empirical experiments or their setup and therefore provides no details such as hyperparameter values or training configurations.
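Since the paper is purely theoretical, no reference implementation exists. For orientation only, the core object it studies (DP k-means) can be illustrated by a minimal SuLQ-style sketch: one Lloyd iteration made differentially private by adding Laplace noise to cluster sums and counts. This is a well-known baseline, not the paper's near-optimal MPC algorithm; the function name, the bounded-data assumption, and the simplified sensitivity accounting are all assumptions made here for illustration.

```python
import numpy as np

def dp_kmeans_step(points, centers, eps, rng):
    """One Lloyd iteration privatized via the Laplace mechanism.

    Illustrative SuLQ-style baseline, NOT the paper's MPC algorithm.
    Assumes points lie in [-1, 1]^d, so under add/remove neighboring
    datasets one point shifts a cluster's coordinate sums by at most
    d in L1 and its count by 1, giving a (simplified) per-iteration
    sensitivity of d + 1 for each released (count, sum) pair.
    """
    n, d = points.shape
    k = len(centers)
    # Assign each point to its nearest center (squared Euclidean distance).
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    scale = (d + 1) / eps  # Laplace scale for sensitivity d + 1
    new_centers = centers.copy()
    for j in range(k):
        mask = labels == j
        noisy_count = mask.sum() + rng.laplace(scale=scale)
        noisy_sum = points[mask].sum(axis=0) + rng.laplace(scale=scale, size=d)
        if noisy_count > 1.0:  # leave near-empty clusters at their old center
            new_centers[j] = np.clip(noisy_sum / noisy_count, -1.0, 1.0)
    return new_centers

# Toy usage: two well-separated Gaussian blobs in [-1, 1]^2.
rng = np.random.default_rng(0)
pts = np.concatenate([rng.normal(-0.5, 0.05, (200, 2)),
                      rng.normal(0.5, 0.05, (200, 2))]).clip(-1, 1)
centers = np.array([[-0.2, 0.0], [0.2, 0.0]])
for _ in range(5):
    centers = dp_kmeans_step(pts, centers, eps=1.0, rng=rng)
```

Note that each iteration spends a separate privacy budget of eps, so running T iterations costs T * eps by basic composition; the paper's contribution is precisely to avoid such naive accounting while keeping O(1) MPC rounds and poly(k, d) additive error.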