On Differentially Private Subspace Estimation in a Distribution-Free Setting
Authors: Eliad Tsfadia
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4 we empirically compared our method to the additive-gap based approach of Dwork et al. [2014] for the task of privately estimating the empirical mean of points that are close to a small dimensional subspace, demonstrating the advantage of our approach in high-dimensional regimes. |
| Researcher Affiliation | Academia | Eliad Tsfadia, Department of Computer Science, Georgetown University, eliadtsfadia@gmail.com |
| Pseudocode | Yes | Algorithm C.1 (Algorithm Est Subspace). Input: A dataset X = (x_1, . . . , x_n) ∈ (S^d)^n. Parameters: k, t. Oracle: A DP algorithm Agg for aggregating projection matrices. Operation: ... (This is one example; the paper contains many such algorithm blocks.) A hedged sketch of this sample-and-aggregate structure is given below the table. |
| Open Source Code | No | The paper mentions the source code for the "Friendly Core-based averaging algorithm of Tsfadia et al. [2022]" is publicly available, but it does not provide an explicit statement or link for the source code of the methodology described in this paper. |
| Open Datasets | No | In order to generate a synthetic dataset that approximately lies in a k-dimensional subspace, we initially sample uniformly random b_1, . . . , b_k ∈ {-1, 1}^d and perform the following process to generate each data point: (i) Sample a random unit vector u in Span{b_1, . . . , b_k}, (ii) Sample a random noise vector ν ∈ {-1/τ, 1/τ}^d, and (iii) Output (u + ν)/‖u + ν‖. A hedged code sketch of this generation process is also given below the table. |
| Dataset Splits | No | The paper describes the generation of a synthetic dataset and then directly uses it for experiments and comparisons, but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | All experiments were tested on a MacBook Pro laptop with an 8-core Apple M1 CPU and 16GB RAM. |
| Software Dependencies | No | The paper mentions implementing the algorithm in "Python" and using the "sklearn library" (randomized_svd function), but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | In all our experiments, we use ρ = 2 and δ = 10^-5, t = 125 (the number of subsets in the sample-and-aggregate process), n = 2tk data points, q = 10k (the number of reference points in the aggregation), and use the zCDP implementation of the Friendly Core-based averaging algorithm of Tsfadia et al. [2022]. |
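
The Pseudocode row quotes Algorithm C.1, which splits the dataset into t subsets, computes a rank-k projection matrix per subset, and hands those matrices to a DP aggregation oracle Agg. The skeleton below is a minimal, illustrative reconstruction of only that non-private sample-and-aggregate scaffolding: the function name `est_subspace`, the NumPy-based splitting, and the use of sklearn's `randomized_svd` (mentioned in the Software Dependencies row) are assumptions, and the DP aggregator (e.g., the Friendly Core-based averaging of Tsfadia et al. [2022]) is left as an abstract callable rather than implemented.

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

def est_subspace(X, k, t, agg):
    """Sketch of the sample-and-aggregate structure quoted from Algorithm C.1.

    X   : (n, d) array of unit vectors (the dataset).
    k   : target subspace dimension.
    t   : number of disjoint subsets in the sample-and-aggregate step.
    agg : a DP aggregation oracle mapping a list of d x d projection matrices
          to a single privately aggregated projection matrix (not implemented here).
    """
    n, d = X.shape

    # Split the dataset into t disjoint subsets of (roughly) equal size.
    subsets = np.array_split(np.arange(n), t)

    projections = []
    for idx in subsets:
        # Top-k right singular subspace of the subset, via sklearn's randomized SVD.
        _, _, Vt = randomized_svd(X[idx], n_components=k, random_state=0)
        # Rank-k projection matrix onto the estimated subspace of this subset.
        projections.append(Vt.T @ Vt)

    # Aggregate the per-subset projection matrices with the DP oracle.
    return agg(projections)
```

As noted in the Open Source Code row, the author's implementation is not public, so this scaffold should be read as a reconstruction of the quoted pseudocode's interface rather than the released method.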
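
The Open Datasets row quotes the three-step synthetic data generation process. The following is a minimal sketch of that process under the stated description; the function name, parameter names (n, d, k, tau), and the use of NumPy are assumptions, not the author's released code.

```python
import numpy as np

def generate_synthetic_dataset(n, d, k, tau, rng=None):
    """Generate n unit vectors in R^d lying close to a random k-dimensional subspace.

    Follows the process quoted above: sample sign vectors b_1, ..., b_k, draw each
    point as a random unit vector in their span, add +-1/tau coordinate noise, and
    renormalize. Parameter names are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Sample uniformly random b_1, ..., b_k in {-1, +1}^d spanning the target subspace.
    B = rng.choice([-1.0, 1.0], size=(d, k))

    # Orthonormal basis of Span{b_1, ..., b_k}, used to draw uniform unit vectors in the span.
    Q, _ = np.linalg.qr(B)

    X = np.empty((n, d))
    for i in range(n):
        # (i) Random unit vector u in the span: Gaussian coefficients, then normalize.
        u = Q @ rng.standard_normal(k)
        u /= np.linalg.norm(u)

        # (ii) Random noise vector nu in {-1/tau, +1/tau}^d.
        nu = rng.choice([-1.0 / tau, 1.0 / tau], size=d)

        # (iii) Output the normalized noisy point (u + nu) / ||u + nu||.
        x = u + nu
        X[i] = x / np.linalg.norm(x)
    return X
```

With the setup quoted in the Experiment Setup row, one would call this with n = 2tk (t = 125); the ambient dimension d and noise scale τ used in the paper's experiments are not reproduced here.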