Differentially Private k-Means with Constant Multiplicative Error

Authors: Uri Stemmer, Haim Kaplan

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variables (Result and LLM Response)
Research Type: Theoretical

We design new differentially private algorithms for the Euclidean k-means problem, both in the centralized model and in the local model of differential privacy. In both models, our algorithms achieve significantly better error guarantees than the previous state of the art. In addition, in the local model, our algorithm significantly reduces the number of interaction rounds. Although the problem has been widely studied in the context of differential privacy, all of the existing constructions achieve only super-constant approximation factors. We present for the first time efficient private algorithms for the problem with constant multiplicative error.
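For reference, the guarantee above is stated with respect to the standard Euclidean k-means objective. The following formulation is not quoted in the report; it is the usual definition, under which an algorithm has multiplicative error γ and additive error η if it returns a set of centers C with cost_S(C) ≤ γ·OPT_k(S) + η:

    \[
    \mathrm{cost}_S(C) = \sum_{x \in S} \min_{c \in C} \lVert x - c \rVert_2^2,
    \qquad
    \mathrm{OPT}_k(S) = \min_{|C| = k} \mathrm{cost}_S(C).
    \]

"Constant multiplicative error" in the title refers to achieving γ = O(1), which the abstract identifies as the first such guarantee attained by efficient private algorithms.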
Researcher Affiliation: Collaboration

Haim Kaplan, Tel Aviv University and Google, haimk@post.tau.ac.il; Uri Stemmer, Ben-Gurion University, u@uri.co.il
Pseudocode: Yes

Algorithm Private-k-Means
Input: Database S containing n points in the d-dimensional ball B(0, Λ), failure probability β, privacy parameters ε, δ.
1. Initiate C = ∅, and denote S_1 = S and n_1 = n.
2. For i = 1 to log log n do:
   (a) Run algorithm Private-Centers on the database S_i with parameters ε/log log n, δ/log log n, β/k, and add the returned set of centers to C.
   (b) Let S_{i+1} ⊆ S_i be a subset of S_i containing the n_{i+1} = 2(T+1)·w·k·n_i^{0.1} points with the largest distance to the centers in C, where w = w(n, d, k, β, ε, δ) and T = T(n) will be specified in the analysis.
3. Output C.
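Since no implementation is released (see Open Source Code below), the following is a minimal Python sketch of this outer loop under stated assumptions: private_centers is a hypothetical callable standing in for the paper's Private-Centers subroutine, and w and T are passed in as plain scalars in place of the quantities w(n, d, k, β, ε, δ) and T(n) fixed in the paper's analysis.

    import numpy as np

    def private_k_means(S, k, beta, eps, delta, private_centers, w, T):
        # Sketch of the Private-k-Means outer loop. `private_centers`, `w`,
        # and `T` are placeholders for objects the paper defines elsewhere.
        S_i = np.asarray(S, dtype=float)
        n_i = len(S_i)
        rounds = max(1, int(np.ceil(np.log2(np.log2(n_i)))))  # ~ log log n
        C = []  # step 1: C = empty set, S_1 = S, n_1 = n
        for _ in range(rounds):  # step 2
            # (a) split the privacy budget evenly across the rounds
            C.extend(private_centers(S_i, eps / rounds, delta / rounds, beta / k))
            # (b) keep the n_{i+1} = 2(T+1)*w*k*n_i^0.1 points farthest from C
            n_next = min(len(S_i), int(2 * (T + 1) * w * k * n_i ** 0.1))
            centers = np.asarray(C)
            dists = np.min(
                np.linalg.norm(S_i[:, None, :] - centers[None, :, :], axis=2),
                axis=1,
            )
            S_i = S_i[np.argsort(dists)[-n_next:]]  # largest distances to C
            n_i = n_next
        return np.asarray(C)  # step 3: output C

A non-private stub such as lambda S, eps, delta, beta: [S.mean(axis=0)] exercises the control flow (it provides no privacy, of course); a faithful implementation would substitute the paper's Private-Centers construction.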
Open Source Code: No

The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology.

Open Datasets: No

The paper is theoretical and describes algorithms with mathematical guarantees, referring to abstract data points in R^d rather than specific public datasets. There is no mention of dataset names, links, or citations for publicly available data used for training.

Dataset Splits: No

The paper is theoretical and does not conduct empirical experiments, so no training/validation/test dataset splits are mentioned.

Hardware Specification: No

The paper does not provide any specific details about the hardware used to run experiments, which is consistent with its theoretical nature.

Software Dependencies: No

The paper does not list specific software dependencies with version numbers.

Experiment Setup: No

The paper is theoretical and focuses on algorithm design and mathematical proofs; it does not include details such as hyperparameters, optimization settings, or specific training configurations for an experimental setup.