On Differentially Private Subspace Estimation in a Distribution-Free Setting

Authors: Eliad Tsfadia

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental In Section 4 we empirically compared our method to the additive-gap based approach of Dwork et al. [2014] for the task of privately estimating the empirical mean of points that are close to a small dimensional subspace, demonstrating the advantage of our approach in high-dimensional regimes.
Researcher Affiliation Academia Eliad Tsfadia, Department of Computer Science, Georgetown University, eliadtsfadia@gmail.com
Pseudocode Yes Algorithm C.1 (Algorithm EstSubspace). Input: A dataset X = (x_1, ..., x_n) ∈ (S_d)^n. Parameters: k, t. Oracle: A DP algorithm Agg for aggregating projection matrices. Operation: ... (This is one example; the paper contains many such algorithm blocks.)
Open Source Code No The paper mentions the source code for the "Friendly Core-based averaging algorithm of Tsfadia et al. [2022]" is publicly available, but it does not provide an explicit statement or link for the source code of the methodology described in this paper.
Open Datasets No In order to generate a synthetic dataset that approximately lies in a k-dimensional subspace, we initially sample uniformly random b_1, ..., b_k ∈ {-1, 1}^d and perform the following process to generate each data point: (i) Sample a random unit vector u in Span{b_1, ..., b_k}, (ii) Sample a random noise vector ν ∈ {-1/τ, 1/τ}^d, and (iii) Output (u+ν)/‖u+ν‖.
Dataset Splits No The paper describes the generation of a synthetic dataset and then directly uses it for experiments and comparisons, but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification Yes All experiments were tested on a MacBook Pro laptop with 8-core Apple M1 CPU with 16GB RAM.
Software Dependencies No The paper mentions implementing the algorithm in "Python" and using the "sklearn library" (randomized_svd function), but it does not specify version numbers for these software components.
Experiment Setup Yes In all our experiments, we use ρ = 2 and δ = 10^-5, t = 125 (the number of subsets in the sample-and-aggregate process), n = 2tk data points, q = 10·k (the number of reference points in the aggregation), and use the zCDP implementation of the Friendly Core-based averaging algorithm of Tsfadia et al. [2022].
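
Since no dataset is released, the three-step synthetic data generation quoted under Open Datasets can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the author's code: the function name `generate_dataset`, the `seed` argument, and the value of τ used in the test call are hypothetical, and drawing u via Gaussian coefficients over the basis is an assumption, because the quoted text does not say how the random unit vector in the span is sampled.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def generate_dataset(n, d, k, tau, seed=0):
    """Return n unit vectors in R^d that lie close to a random k-dimensional subspace."""
    rng = random.Random(seed)
    # Basis vectors b_1, ..., b_k drawn uniformly from {-1, 1}^d.
    basis = [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(k)]

    points = []
    for _ in range(n):
        # (i) Random unit vector u in Span{b_1, ..., b_k}.
        # Assumption: Gaussian coefficients, then normalization.
        coeffs = [rng.gauss(0.0, 1.0) for _ in range(k)]
        u = normalize(
            [sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(d)]
        )
        # (ii) Noise vector nu with entries uniform in {-1/tau, 1/tau}.
        nu = [rng.choice((-1.0, 1.0)) / tau for _ in range(d)]
        # (iii) Output (u + nu) / ||u + nu||.
        points.append(normalize([a + b for a, b in zip(u, nu)]))
    return points
```

Larger values of τ shrink the noise, so the points sit closer to the subspace; every output has unit norm, which matches the requirement that the dataset lies in (S_d)^n.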
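
The parameter relations quoted under Experiment Setup can be written down as a small configuration object, which makes the derived quantities explicit. The sketch below is a hedged illustration: `ExperimentConfig` and `split_into_subsets` are hypothetical names, q is read as 10 times k, and partitioning the data into t consecutive equal-size subsets is an assumption about the sample-and-aggregate step (the quoted text gives only t and n = 2tk, which implies 2k points per subset if the split is even).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the reported experiment parameters."""
    k: int               # target subspace dimension
    rho: float = 2.0     # zCDP privacy parameter
    delta: float = 1e-5  # delta = 10^-5
    t: int = 125         # number of subsets in sample-and-aggregate

    @property
    def n(self):
        # Number of data points: n = 2tk.
        return 2 * self.t * self.k

    @property
    def q(self):
        # Number of reference points in the aggregation (read as 10 * k).
        return 10 * self.k


def split_into_subsets(points, t):
    """Partition points into t consecutive subsets of equal size.

    Assumption: an even, non-random split; the source does not specify
    how the subsets are formed.
    """
    if len(points) % t != 0:
        raise ValueError("number of points must be divisible by t")
    size = len(points) // t
    return [points[i * size:(i + 1) * size] for i in range(t)]
```

For example, with k = 4 this gives n = 1000 points and q = 40 reference points, and the split yields 125 subsets of 8 = 2k points each.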