Unmasking Vulnerabilities: Cardinality Sketches under Adaptive Inputs
Authors: Sara Ahmadian, Edith Cohen
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our attack used only 4k queries with the widely used HyperLogLog (HLL++) (Flajolet et al., 2007b; Heule et al., 2013) sketch. We conduct an empirical evaluation of our proposed attack on the HyperLogLog (HLL) sketch (Durand & Flajolet, 2003; Flajolet et al., 2007a) with the HLL++ estimator (Heule et al., 2013). |
| Researcher Affiliation | Collaboration | (1) Google Research, United States; (2) Department of Computer Science, Tel Aviv University, Israel. |
| Pseudocode | Yes | Algorithm 1: Attack standard estimators. Algorithm 3: Single Batch Attacker. Algorithm 4: Adaptive Attacker. |
| Open Source Code | No | The paper mentions utilizing 'the open-source implementation of HLL++ algorithm in github' but does not state that the code developed for this paper is open-source or provide a link to it. |
| Open Datasets | No | The paper states: 'To generate the data... we generate random strings using the English alphabet of a fixed length'. It does not refer to a publicly available dataset or provide access information for the data generated for the experiments. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or detailed methodology for splitting into training, validation, or test sets). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions utilizing 'the open-source implementation of HLL++ algorithm in github' but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We set the size of our ground set n k to be in this relevant regime. We consider two different error rates, ϵ = 0.1 with corresponding sketch size k = 104 and ϵ = 0.05, with corresponding sketch size k = 416. We use the same ground set comprising 5000 keys for both sets of experiments. For each sketch size k, we generate a ground set of size n = 10 · 10 log10(k) to ensure that the ground set is larger than the sketch size and the MinHash component of the HLL++ estimator is used. |
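
The data generation and error-rate settings quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' code: the string length (12), the lowercase alphabet, the seed, and the helper names `make_ground_set` and `hll_standard_error` are assumptions made for illustration. The `1.04 / sqrt(k)` formula is the standard asymptotic relative error of HyperLogLog, used here only as a plausible reading of how ϵ = 0.1 and ϵ = 0.05 relate to k = 104 and k = 416; the table itself does not state this relation.

```python
import math
import random
import string


def make_ground_set(n, length=12, seed=0):
    """Return n distinct random strings of a fixed length.

    Assumption: lowercase English letters and length 12; the source only
    says 'random strings using the English alphabet of a fixed length'.
    """
    rng = random.Random(seed)
    keys = set()
    # Keep drawing until we have n distinct keys (collisions are very rare).
    while len(keys) < n:
        keys.add("".join(rng.choices(string.ascii_lowercase, k=length)))
    return sorted(keys)


def hll_standard_error(k):
    """Standard asymptotic relative error of HyperLogLog with k registers."""
    return 1.04 / math.sqrt(k)


# Ground set of 5000 keys, as quoted in the Experiment Setup row.
ground_set = make_ground_set(5000)

# Nominal error for the two sketch sizes quoted in the table.
for k in (104, 416):
    print(f"k={k}: nominal relative error ~ {hll_standard_error(k):.3f}")
```

Fixing the seed keeps the generated ground set reproducible across runs, which partly compensates for the absence of a published dataset noted in the Open Datasets row.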