Differentially Private Linear Sketches: Efficient Implementations and Applications

Authors: Fuheng Zhao, Dan Qiao, Rachel Redberg, Divyakant Agrawal, Amr El Abbadi, Yu-Xiang Wang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have implemented DP linear sketches and DP DCS, and conducted extensive experiments to evaluate the privacy-utility trade-off of our proposed private sketches.
Researcher Affiliation | Academia | Fuheng Zhao (fuheng_zhao@ucsb.edu), Dan Qiao (danqiao@ucsb.edu), Rachel Redberg (rredberg@ucsb.edu), Divyakant Agrawal (agrawal@cs.ucsb.edu), Amr El Abbadi (amr@cs.ucsb.edu), Yu-Xiang Wang (yuxiangw@ucsb.edu); Department of Computer Science, UC Santa Barbara.
Pseudocode | Yes | Algorithm 1: Linear Sketch Update(x, v); Algorithm 2: Linear Sketch Query(x); Algorithm 3: DP Linear Sketch Initialization with Gaussian Noise.
Open Source Code | Yes | The code for the following experiments can be found on GitHub (footnote 3: https://github.com/ZhaoFuheng/Differentially-Private-Linear-Sketches).
Open Datasets | Yes | We consider the synthetic Zipf dataset (Zipf [2016]) with universe size of 2^16 and the source IP addresses from the CAIDA Anonymized Internet Trace 2015 dataset with universe size of 2^32. (Bibliography entry: Anonymized internet traces 2015. https://catalog.caida.org/details/dataset/passive_2015_pcap. Accessed: 2022-5-10.)
Dataset Splits | No | The paper mentions an input database size N = 10^5, but does not provide explicit training, validation, and test dataset splits, percentages, or methodology for partitioning data.
Hardware Specification | Yes | We didn't use any external resources besides a MacBook Pro.
Software Dependencies | No | The paper states 'The implementations are written in Python' but does not specify the Python version or any other software dependencies with version numbers.
Experiment Setup | Yes | The experiments assume β = 1% and N = 10^5. The DP DCS uses privacy budget ε ∈ {0.1, 1, 10}, and all sketches assume γ = 1%.
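
To make the three routines named in the Pseudocode entry more concrete, here is a minimal Python sketch of a linear sketch whose counters are initialized with Gaussian noise and then support Update(x, v) and Query(x). It is an illustration under stated assumptions, not the authors' implementation. The class name NoisyLinearSketch, the seed argument, the hash construction, and the Count-Sketch-style signed median estimator are all hypothetical choices made here. The noise scale sigma is supplied by the caller, and how the paper calibrates it to a privacy budget is not reproduced.

```python
import random
import statistics


class NoisyLinearSketch:
    """Count-Sketch-style linear sketch with Gaussian noise added at initialization.

    Hypothetical illustration only: names, hashing, and estimator are assumptions.
    """

    PRIME = (1 << 61) - 1  # large Mersenne prime for the modular hash family

    def __init__(self, depth, width, sigma, seed=0):
        rng = random.Random(seed)
        self.depth = depth
        self.width = width
        # Every counter starts at an independent N(0, sigma^2) draw instead of 0.
        # sigma is an assumed input; calibrating it to a privacy budget is out of scope.
        self.table = [[rng.gauss(0.0, sigma) for _ in range(width)]
                      for _ in range(depth)]
        # One (a, b) pair per row for bucket hashing and one for sign hashing.
        self.bucket_params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                              for _ in range(depth)]
        self.sign_params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                            for _ in range(depth)]

    def _bucket(self, row, x):
        a, b = self.bucket_params[row]
        return ((a * x + b) % self.PRIME) % self.width

    def _sign(self, row, x):
        a, b = self.sign_params[row]
        return 1 if ((a * x + b) % self.PRIME) % 2 == 0 else -1

    def update(self, x, v=1):
        """Add weight v for item x to one counter per row (a linear operation)."""
        for row in range(self.depth):
            self.table[row][self._bucket(row, x)] += self._sign(row, x) * v

    def query(self, x):
        """Estimate the total weight of x as the median of the signed row counters."""
        return statistics.median(
            self._sign(row, x) * self.table[row][self._bucket(row, x)]
            for row in range(self.depth)
        )


if __name__ == "__main__":
    sketch = NoisyLinearSketch(depth=5, width=256, sigma=1.0, seed=42)
    for item in [7, 7, 7, 13, 99]:
        sketch.update(item)
    print("noisy estimate for item 7:", sketch.query(7))
```

Because the noise is drawn once at initialization and updates only add to counters, the update and query paths are the same as in a non-private sketch. Setting sigma to 0 recovers an ordinary sketch, which is convenient for checking the hashing logic in isolation.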