FriendlyCore: Practical Differentially Private Aggregation

Authors: Eliad Tsfadia, Edith Cohen, Haim Kaplan, Yishay Mansour, Uri Stemmer

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate its advantages in boosting the accuracy of mean estimation and clustering tasks such as k-means and k-GMM, outperforming tailored methods. |
| Researcher Affiliation | Collaboration | Eliad Tsfadia (1, 2), Edith Cohen (1, 2), Haim Kaplan (1, 2), Yishay Mansour (1, 2), Uri Stemmer (1, 2). Affiliations: 1 Google Research; 2 Blavatnik School of Computer Science, Tel Aviv University. |
| Pseudocode | Yes | Algorithm 4.1 (FriendlyCore), Algorithm 5.1 (FC Avg), and Algorithm 5.3 (FC Clustering, informal). |
| Open Source Code | No | The paper states: 'The implementations of CoinPress, and the experimental test bed, were taken from the publicly available code of (Biswas et al., 2020) provided at https://github.com/twistedcubic/coin-press.' This refers to a competitor's code, not the authors' own implementation of FriendlyCore. |
| Open Datasets | Yes | In Figure 4(4) we used the publicly available dataset of (Fonollosa & Huerta, 2015) that contains the acquired time series from 16 chemical gas sensors exposed to gas mixtures at varying concentration levels. |
| Dataset Splits | No | The paper describes generating synthetic datasets and repeating each experiment (e.g., '50 repetitions of each experiment' or '30 repetitions of each experiment'), but it does not specify explicit training, validation, and test splits, nor a cross-validation strategy. |
| Hardware Specification | Yes | In all experiments we used privacy parameter ρ = 1, δ = 10⁻⁸, and all of them were tested on a MacBook Pro laptop with a 4-core Intel i7 CPU at 2.8 GHz and 16 GB RAM. |
| Software Dependencies | No | The paper mentions a 'Python implementation' of its algorithms and 'the KMeans algorithm of the Python library sklearn' for the clustering oracle, but it does not give version numbers for Python or any library such as scikit-learn, which would be needed for a reproducible software environment. |
| Experiment Setup | Yes | In all experiments we used privacy parameter ρ = 1, δ = 10⁻⁸ ... For FC Clustering we used oracle access to k-means++ provided by the KMeans algorithm of the Python library sklearn, and used r_min = 0.001 and radius Λ = 1. We set the radius parameter of LSH Clustering to 1. |
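
The Pseudocode row above refers to Algorithm 4.1 (FriendlyCore), which filters a dataset down to a "friendly" core of mutually close points before a simple DP aggregator is applied. The Python sketch below only illustrates the general shape of such a predicate-based filter; the function name `friendly_core_sketch`, the closeness predicate (distance at most r), and the keep-probability ramp are illustrative assumptions and do not reproduce the paper's calibrated, privacy-accounted procedure.

```python
import numpy as np

def friendly_core_sketch(X, r, rng=None):
    """Illustrative sketch (not the paper's exact Algorithm 4.1):
    keep points that are within distance r of most of the dataset,
    dropping outliers before a 'friendly' DP aggregation step.
    The real FriendlyCore uses a carefully calibrated randomized
    keep-probability so that the full pipeline satisfies zCDP."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    # c[i] = number of points (including X[i] itself) within distance r of X[i]
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    c = (dists <= r).sum(axis=1)
    # Soft threshold around n/2: placeholder ramp, not the paper's constants.
    keep_prob = np.clip((c - 0.5 * n) / (0.5 * n), 0.0, 1.0)
    keep = rng.random(n) < keep_prob
    return X[keep]
```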
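The experiments report privacy in terms of zCDP (ρ = 1) together with δ = 10⁻⁸. For readers more accustomed to (ε, δ)-DP, the standard Bun–Steinke conversion gives a rough equivalent; the snippet below is a small helper for that conversion, not code from the paper.

```python
import math

def zcdp_to_approx_dp(rho, delta):
    """Standard conversion (Bun & Steinke, 2016): rho-zCDP implies
    (rho + 2*sqrt(rho*ln(1/delta)), delta)-differential privacy."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

# Parameters reported in the paper's experiments: rho = 1, delta = 1e-8.
print(zcdp_to_approx_dp(1.0, 1e-8))  # approximately 9.58
```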
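The Experiment Setup row notes that FC Clustering calls a non-private k-means++ oracle via sklearn's KMeans. A minimal sketch of such an oracle follows; the wrapper name `kmeans_oracle` and its arguments are illustrative, and the way the paper's clustering pipeline actually invokes and privatizes the oracle is specified in the paper, not in this snippet.

```python
from sklearn.cluster import KMeans  # version not pinned in the paper

def kmeans_oracle(points, k):
    """Non-private clustering oracle in the spirit of the paper's setup:
    sklearn's KMeans with k-means++ initialization, returning k centers."""
    km = KMeans(n_clusters=k, init="k-means++", n_init=10).fit(points)
    return km.cluster_centers_
```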