Perturb-and-Project: Differentially Private Similarities and Marginals
Authors: Vincent Cohen-Addad, Tommaso D’Orsi, Alessandro Epasto, Vahab Mirrokni, Peilin Zhong
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we study input perturbation techniques from a theoretical perspective. We focus on the problem of designing differentially private projection functions (such as dimensionality reduction techniques). We observe that, quite surprisingly, input perturbations yield the best known (and conjecturally optimal) guarantees for a large class of projection functions. This challenges the expectation that a general approach might incur sub-optimal utility loss. Our analysis reveals that the algorithm's utility guarantees are contingent upon the richness (metric entropy) of the target set (the set into which the input is projected). Inspired by (d'Orsi et al., 2023), we use sum-of-squares to obtain tight bounds on the Gaussian complexity of the set of solutions. To further improve, we introduce new sum-of-squares certificates (e.g., for sparse injective tensor norms). Finally, while our sum-of-squares solution might appear complex, we extract an interesting explanation for the practical success of algorithms like linear projection. |
| Researcher Affiliation | Collaboration | ¹Google Research, ²BIDSA, Bocconi. |
| Pseudocode | Yes | Algorithm 1 Perturb-and-project; Algorithm 2 Perturb-and-alternately-project; Algorithm 3 Private pair-wise cosine similarities; Algorithm 4 Practical pair-wise cosine similarities |
| Open Source Code | No | The paper does not provide any statement or link indicating that open-source code for the described methodology is available. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific public datasets. It discusses 'datasets' in a general sense (e.g., 'dataset of vectors in {0, 1}n') but does not name or provide access information for any public or open dataset used for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation procedures, such as dataset splits (training, validation, test) or cross-validation setups. |
| Hardware Specification | No | The paper focuses on theoretical contributions and algorithm design; therefore, it does not mention any specific hardware specifications used for experiments. |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers for replication. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithms and proofs; it does not provide specific experimental setup details such as hyperparameters or training configurations. |
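The perturb-and-project paradigm described above (add calibrated noise to the input, then apply the projection function) can be sketched as follows. This is an illustrative toy sketch, not the paper's Algorithm 1: the function names, the fixed noise scale, and the unit-sphere projection (a stand-in target set in the spirit of the cosine-similarity application) are all assumptions, and the calibration of the noise to a concrete privacy budget is omitted.

```python
import numpy as np

def perturb_and_project(x, sigma, project, seed=0):
    """Illustrative perturb-and-project sketch: add Gaussian noise to the
    input (the perturbation step), then map the noisy point back onto the
    target set via the supplied projection function."""
    rng = np.random.default_rng(seed)
    noisy = x + rng.normal(0.0, sigma, size=x.shape)
    return project(noisy)

def unit_sphere(v):
    """Hypothetical target set: projection onto the unit sphere
    (i.e., L2 normalization), as one might use for cosine similarities."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

x = np.ones(4)
y = perturb_and_project(x, sigma=0.1, project=unit_sphere)
```

The point of the paradigm is that privacy comes entirely from the noise added to the input, while the projection step (here, normalization) only improves utility and costs no additional privacy by post-processing.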