A Differentially Private Clustering Algorithm for Well-Clustered Graphs

Authors: Weiqiang He, Hendrik Fichtenberger, Pan Peng

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct experimental evaluations on datasets with known ground truth clusters to substantiate the prowess of our algorithm. We also show that any (pure) ϵ-DP algorithm would result in substantial error.
Researcher Affiliation Collaboration 1School of Computer Science and Technology, University of Science and Technology of China 2Google Research
Pseudocode Yes Algorithm 1: Private Clustering
Open Source Code No The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets No The paper samples synthetic datasets from a stochastic block model and does not provide a public link, DOI, or specific citation to make the generated datasets publicly available.
Dataset Splits No The paper mentions sampling SBM graphs and running algorithms multiple times but does not specify any training, validation, or test dataset splits.
Hardware Specification Yes The experiments were run on a single Google Cloud n2-standard-128 instance.
Software Dependencies Yes We model the SDP (2) in CVXPY 1.3.2 and use SCS 3.2.3 as SDP solver. We use NumPy 1.23.5 for numerical computation and scikit-learn 1.3 for an implementation of k-means++.
Experiment Setup Yes For our algorithm, we use b = (k 1)/k and λ = c q mϵ2 n log(2/δ), where c is a trade-off constant that we fix for each SBM parameterization before sampling the input datasets. For randomized response, we replace the objective with arg min X D LG, X , i.e., we remove the regularizer term, and set p RR = 1/(1+eϵ). For our experiments, we use ϵ = 1 and δ = 1/n2.