Improved Utility Analysis of Private CountSketch
Authors: Rasmus Pagh, Mikkel Thorup
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have conducted experiments to empirically investigate the properties of private Count Sketch. Our main result ignores constant factors in the exponent of the bound on failure probability, but we see in experiments that these constants are very reasonable. All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro). Density plots are smoothed using the Seaborn library s kdeplot function with default parameters. |
| Researcher Affiliation | Academia | Rasmus Pagh Basic Algorithms Research Copenhagen University of Copenhagen pagh@di.ku.dk Mikkel Thorup Basic Algorithms Research Copenhagen University of Copenhagen mikkel2thorup@gmail.com |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro). |
| Open Datasets | Yes | Retrieved 2022-01-27 from https://simplemaps.com/data/world-cities (for world cities dataset) and market basket data sets obtained from the FIMI data collection, http://fimi.uantwerpen.be/data/ (for kosarak and retail datasets). |
| Dataset Splits | No | The paper describes the datasets used but does not provide specific train/validation/test split percentages or sample counts for its experiments. |
| Hardware Specification | Yes | All experiments can be recreated by running a Python script available on Git Hub 1 (runs in 13 minutes on an M1 Mac Book Pro). |
| Software Dependencies | No | The paper mentions "Density plots are smoothed using the Seaborn library s kdeplot function with default parameters." but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For privacy parameters ε = 1 and δ = 10 6 (with respect to a single market basket) we see that the error of Private Count Sketch is nearly as well concentrated as the error of Count Sketch. The table size b of the sketches is chosen to roughly balance the noise from the Count Sketch itself and the Gaussian mechanism. The scale of the noise added for differential privacy is σ = 10,000 (left) and σ = 100,000 (right). |