Differentially Private k-Means via Exponential Mechanism and Max Cover
Authors: Huy L. Nguyen, Anamay Chaturvedi, Eric Z Xu9101-9108
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with some experiments and find an improvement over previously implemented work for this problem. [...] We present an experimental comparison between algorithm 12, the private k-means clustering algorithm from Balcan et al. (2017), and the non-private Lloyd s algorithm. [...] Figure 1: Empirical comparison of algorithm 1 and the private k-means clustering algorithm from (Balcan et al. 2017). |
| Researcher Affiliation | Academia | Huy L. Nguy ên , Anamay Chaturvedi , Eric Z Xu Khoury College of Computer Sciences, Northeastern University 440 Huntington Ave Boston, Massachusetts 02115 |
| Pseudocode | Yes | Algorithm 1: Private k-means [...] Algorithm 2: Private grid set cover [...] Algorithm 3: Noisy AVG |
| Open Source Code | Yes | The code used for our experiments is available at https://github. com/Anamay-Chaturvedi/Differentially-private-k-means |
| Open Datasets | Yes | Two datasets are used; a synthetic dataset reproducing the construction in Balcan et al. (2017) and the MNIST training dataset (Lecun et al. 1998). The MNIST training dataset uses the raw pixels; it is comprised of 60,000 points with 784 features each. |
| Dataset Splits | No | The paper mentions using the MNIST training dataset and a synthetic dataset, but it does not specify any training, validation, or test splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general experimental setup. |
| Software Dependencies | No | The paper mentions using the Laplace mechanism and replacing it with the Noisy AVG routine, but it does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Implementation details: We set ϵ = 1 and δ = n 1.5 for both algorithms. Similar to Balcan et al. (2017), we project to a smaller subspace of dimension log(n)/2 rather than O(log(n)/ϵ2) this does not affect privacy. [...] Lloyd s algorithm was executed with 10 iterations. |