Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Differentially private subspace clustering
Authors: Yining Wang, Yu-Xiang Wang, Aarti Singh
NeurIPS 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate via both theory and experiments that one of the presented methods enjoys formal privacy and utility guarantees; the other one asymptotically preserves differential privacy while having good performance in practice. We provide numerical results of both the sample-aggregate and Gibbs sampling algorithms on synthetic and real-world datasets. |
| Researcher Affiliation | Academia | Yining Wang, Yu-Xiang Wang and Aarti Singh Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA EMAIL |
| Pseudocode | Yes | Algorithm 1: The sample-aggregate framework [22]; Algorithm 2: Threshold-based subspace clustering (TSC), a simplified version |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | We also experiment on real-world datasets. The right two plots in Figure 2 report utility on a subset of the extended Yale Face Dataset B [13] for face clustering. |
| Dataset Splits | No | The paper specifies dataset sizes (e.g., 'n = 5000' for synthetic, 'n = 320' for Yale Face Dataset B) but does not provide specific training, validation, or test split percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states, 'All methods are implemented using Matlab,' but does not provide a specific version number for Matlab or any other software dependencies. |
| Experiment Setup | Yes | δ is set to 1/(n ln n) for (ε, δ)-privacy algorithms. s.a. stands for smooth sensitivity and exp. stands for exponential mechanism. SuLQ-10 and SuLQ-50 stand for the SuLQ framework performing 10 and 50 iterations. Gibbs sampling is run for 10000 iterations and the mean of the last 100 samples is reported. |
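
The Experiment Setup row quotes two concrete settings: δ = 1/(n ln n), and reporting the mean of the last 100 of 10000 Gibbs samples. The following is a minimal Python sketch of that arithmetic only, not the authors' Matlab implementation. The function names `delta_for` and `mean_of_last` are hypothetical, and the samples are assumed to be scalar values (for example, a per-iteration utility score), which may differ from what the paper actually averages.

```python
import math


def delta_for(n: int) -> float:
    """Return delta = 1 / (n ln n), the setting quoted for (eps, delta)-privacy."""
    if n < 2:
        # ln(1) = 0 would divide by zero; smaller n is not meaningful here.
        raise ValueError("n must be at least 2 so that ln n > 0")
    return 1.0 / (n * math.log(n))


def mean_of_last(samples, k: int = 100) -> float:
    """Average the final k entries of a sequence of scalar samples.

    Assumption: each sample is a single number. The paper may average
    a different quantity over its last 100 Gibbs samples.
    """
    tail = list(samples)[-k:]
    if not tail:
        raise ValueError("samples must be non-empty")
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Dataset sizes quoted in the Dataset Splits row.
    for n in (5000, 320):
        print(f"n = {n}: delta = {delta_for(n):.3e}")
```

With the dataset sizes quoted in the table, `delta_for(5000)` and `delta_for(320)` give the δ values implied by the stated rule; `mean_of_last(chain)` would be applied to a 10000-element chain of per-iteration values.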