Differentially Private k-Means via Exponential Mechanism and Max Cover

Authors: Huy L. Nguyen, Anamay Chaturvedi, Eric Z Xu9101-9108

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conclude with some experiments and find an improvement over previously implemented work for this problem. [...] We present an experimental comparison between algorithm 12, the private k-means clustering algorithm from Balcan et al. (2017), and the non-private Lloyd s algorithm. [...] Figure 1: Empirical comparison of algorithm 1 and the private k-means clustering algorithm from (Balcan et al. 2017).
Researcher Affiliation Academia Huy L. Nguy ên , Anamay Chaturvedi , Eric Z Xu Khoury College of Computer Sciences, Northeastern University 440 Huntington Ave Boston, Massachusetts 02115
Pseudocode Yes Algorithm 1: Private k-means [...] Algorithm 2: Private grid set cover [...] Algorithm 3: Noisy AVG
Open Source Code Yes The code used for our experiments is available at https://github. com/Anamay-Chaturvedi/Differentially-private-k-means
Open Datasets Yes Two datasets are used; a synthetic dataset reproducing the construction in Balcan et al. (2017) and the MNIST training dataset (Lecun et al. 1998). The MNIST training dataset uses the raw pixels; it is comprised of 60,000 points with 784 features each.
Dataset Splits No The paper mentions using the MNIST training dataset and a synthetic dataset, but it does not specify any training, validation, or test splits (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification No The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general experimental setup.
Software Dependencies No The paper mentions using the Laplace mechanism and replacing it with the Noisy AVG routine, but it does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes Implementation details: We set ϵ = 1 and δ = n 1.5 for both algorithms. Similar to Balcan et al. (2017), we project to a smaller subspace of dimension log(n)/2 rather than O(log(n)/ϵ2) this does not affect privacy. [...] Lloyd s algorithm was executed with 10 iterations.