Improved Utility Analysis of Private CountSketch

Authors: Rasmus Pagh, Mikkel Thorup

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We have conducted experiments to empirically investigate the properties of private Count Sketch. Our main result ignores constant factors in the exponent of the bound on failure probability, but we see in experiments that these constants are very reasonable. All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro). Density plots are smoothed using the Seaborn library s kdeplot function with default parameters.
Researcher Affiliation Academia Rasmus Pagh Basic Algorithms Research Copenhagen University of Copenhagen pagh@di.ku.dk Mikkel Thorup Basic Algorithms Research Copenhagen University of Copenhagen mikkel2thorup@gmail.com
Pseudocode No The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code Yes All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro).
Open Datasets Yes Retrieved 2022-01-27 from https://simplemaps.com/data/world-cities (for world cities dataset) and market basket data sets obtained from the FIMI data collection, http://fimi.uantwerpen.be/data/ (for kosarak and retail datasets).
Dataset Splits No The paper describes the datasets used but does not provide specific train/validation/test split percentages or sample counts for its experiments.
Hardware Specification Yes All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro).
Software Dependencies No The paper mentions "Density plots are smoothed using the Seaborn library s kdeplot function with default parameters." but does not provide specific version numbers for any software dependencies.
Experiment Setup Yes For privacy parameters ε = 1 and δ = 10 6 (with respect to a single market basket) we see that the error of Private Count Sketch is nearly as well concentrated as the error of Count Sketch. The table size b of the sketches is chosen to roughly balance the noise from the Count Sketch itself and the Gaussian mechanism. The scale of the noise added for differential privacy is σ = 10,000 (left) and σ = 100,000 (right).