A Bayesian Nonparametric View on Count-Min Sketch
Authors: Diana Cai, Michael Mitzenmacher, Ryan P. Adams
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper states "Using simulated data and text data, we investigate the properties of these estimators." and, in Section 5 (Experiments), "We now examine the Bayesian posterior query and point estimates obtained using the CM sketch applied to several data streams." |
| Researcher Affiliation | Academia | Diana Cai (Princeton University, dcai@cs.princeton.edu); Michael Mitzenmacher (Harvard University, michaelm@eecs.harvard.edu); Ryan P. Adams (Princeton University, rpa@princeton.edu) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (specific repository link, explicit code release statement, or code in supplementary materials) for the source code of the methodology described. |
| Open Datasets | Yes | We constructed a stream of tokens using the 20 Newsgroups data set, where the sketch was updated using the training data set (M = 1467345), and evaluated queries on the set of unique tokens in the test set, which had 53975 elements. |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly provide details for a validation set split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For each task, we examined several hash parameter settings of N = 4, 5, 6, with J = 8000, 10000, 12000. For the posterior distribution of the count, we used the Dirichlet process sketching model, inferring α via empirical Bayes. |
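
Since the paper provides neither pseudocode nor source code, the following is a minimal sketch of a classical count-min (CM) sketch with its standard point estimate, included only to illustrate what the hash parameters in the Experiment Setup row control. It assumes that N is the number of hash functions and J is the number of buckets per hash function, which the reported values suggest. The class name `CountMinSketch`, the salted BLAKE2b hashing scheme, and the toy token stream are illustrative assumptions, not the authors' implementation, and the Bayesian posterior under the Dirichlet process model is not reproduced here.

```python
import hashlib


class CountMinSketch:
    """Classical count-min sketch (illustrative, not the authors' code)."""

    def __init__(self, num_hashes, num_buckets):
        # num_hashes plays the role of N, num_buckets the role of J (assumed).
        self.num_hashes = num_hashes
        self.num_buckets = num_buckets
        self.table = [[0] * num_buckets for _ in range(num_hashes)]

    def _bucket(self, row, token):
        # Assumption: one salted BLAKE2b hash per row stands in for the
        # independent hash functions a CM sketch requires.
        digest = hashlib.blake2b(
            token.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.num_buckets

    def update(self, token, count=1):
        # Increment one counter per row for the observed token.
        for row in range(self.num_hashes):
            self.table[row][self._bucket(row, token)] += count

    def query(self, token):
        # Standard CM point estimate: the minimum counter across rows.
        # Collisions can only inflate counters, so this never underestimates.
        return min(
            self.table[row][self._bucket(row, token)]
            for row in range(self.num_hashes)
        )


# Toy stream; the paper's experiments use 20 Newsgroups tokens instead.
sketch = CountMinSketch(num_hashes=4, num_buckets=8000)
for tok in ["the", "sketch", "the", "count", "the"]:
    sketch.update(tok)
estimate = sketch.query("the")  # at least 3, the true count
```

The minimum over rows is an upper bound on the true count, which is the quantity the paper's Bayesian treatment refines into a posterior distribution.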