Differentially private subspace clustering

Authors: Yining Wang, Yu-Xiang Wang, Aarti Singh

NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate via both theory and experiments that one of the presented methods enjoys formal privacy and utility guarantees; the other one asymptotically preserves differential privacy while having good performance in practice. We provide numerical results of both the sample-aggregate and Gibbs sampling algorithms on synthetic and real-world datasets.
Researcher Affiliation | Academia | Yining Wang, Yu-Xiang Wang, and Aarti Singh; Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA. {yiningwa,yuxiangw,aarti}@cs.cmu.edu
Pseudocode | Yes | Algorithm 1: the sample-aggregate framework [22]; Algorithm 2: threshold-based subspace clustering (TSC), a simplified version. (A hedged sketch of TSC appears after this table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | We also experiment on real-world datasets. The right two plots in Figure 2 report utility on a subset of the extended Yale Face Dataset B [13] for face clustering.
Dataset Splits | No | The paper specifies dataset sizes (e.g., 'n = 5000' for synthetic, 'n = 320' for Yale Face Dataset B) but does not provide training, validation, or test split percentages or sample counts.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states, 'All methods are implemented using Matlab,' but does not provide a version number for Matlab or any other software dependencies.
Experiment Setup | Yes | δ is set to 1/(n ln n) for (ε, δ)-privacy algorithms. "s.a." stands for smooth sensitivity and "exp." stands for exponential mechanism. SuLQ-10 and SuLQ-50 stand for the SuLQ framework performing 10 and 50 iterations. Gibbs sampling is run for 10000 iterations and the mean of the last 100 samples is reported. (See the second sketch after this table.)
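
The paper gives TSC only as pseudocode (Algorithm 2, a simplified version) and implements everything in Matlab. As a hedged illustration, here is a minimal Python sketch of the standard non-private TSC recipe (normalize points, keep each point's q largest absolute cosine similarities, then run spectral clustering on the resulting graph); the function name `tsc` and its signature are ours, and the authors' simplified variant may differ in details.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.cluster import KMeans

def tsc(X, q, k):
    """Minimal sketch of threshold-based subspace clustering (TSC).

    X: (d, n) data matrix with one column per point.
    q: number of nearest neighbors kept per point.
    k: number of subspace clusters.
    """
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)  # unit-normalize columns
    C = np.abs(Xn.T @ Xn)                              # |cosine| similarities
    np.fill_diagonal(C, 0.0)                           # no self-edges
    A = np.zeros_like(C)
    for i in range(C.shape[0]):
        nbrs = np.argsort(C[i])[-q:]                   # q most similar points
        A[i, nbrs] = C[i, nbrs]
    A = np.maximum(A, A.T)                             # symmetrize the graph
    L = laplacian(A, normed=True)                      # normalized graph Laplacian
    _, vecs = np.linalg.eigh(L)                        # eigenvalues in ascending order
    U = vecs[:, :k]                                    # bottom-k spectral embedding
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

A typical call would be `labels = tsc(X, q=5, k=3)` for three subspaces; note this sketch contains no privacy mechanism, which in the paper is supplied around it by the sample-aggregate framework.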
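
To make the reported experiment setup concrete, the snippet below computes δ = 1/(n ln n) for the synthetic size n = 5000 quoted above and shows the stated Gibbs reporting protocol (10000 iterations, mean of the last 100 samples). `gibbs_step` and `state` are hypothetical placeholders for one sweep of the paper's sampler, which is not reproduced here.

```python
import numpy as np

n = 5000                         # synthetic dataset size reported in the paper
delta = 1.0 / (n * np.log(n))    # delta for the (epsilon, delta)-private runs

def run_gibbs(gibbs_step, state, n_iters=10_000, n_avg=100):
    """Run a Gibbs sampler for n_iters sweeps and report the mean of the
    last n_avg samples, matching the protocol stated in the paper."""
    tail = []
    for t in range(n_iters):
        state = gibbs_step(state)              # one full Gibbs sweep (placeholder)
        if t >= n_iters - n_avg:
            tail.append(np.asarray(state, dtype=float))
    return np.mean(tail, axis=0)               # mean of the last n_avg samples
```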