reproducibilityindex.ai

WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement

Authors: Edith Cohen, Rasmus Pagh, David Woodruff

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our design is simple and practical, despite intricate analysis, and based on off-the-shelf use of widely implemented heavy hitters sketches such as Count Sketch. Our method is the first to provide WOR sampling in the important regime of p > 1 and the first to handle signed updates for p > 0. ... As a bonus, we include practical optimizations (that preserve the theoretical guarantees) and perform experiments that demonstrate both the practicality and accuracy of WORp.
Researcher Affiliation	Collaboration	Edith Cohen, Google Research, Tel Aviv University, edith@cohenwang.com; Rasmus Pagh, IT University of Copenhagen, BARC, Google Research, pagh@itu.dk; David P. Woodruff, CMU, dwoodruf@cs.cmu.edu
Pseudocode	Yes	Algorithm 1: WORp (high level)
Open Source Code	Yes	Code for the experiments is provided in the following Colab notebook: https://colab.research.google.com/drive/1Tix7SwsPp7A_OtSuaRf3IwfTH-qo9_81?usp=sharing
Open Datasets	No	The paper uses Zipfian distributions for its experiments, which are standard synthetic data models. While reproducible by definition, the paper does not provide a specific URL, DOI, repository, or formal citation to a pre-existing, externally hosted 'dataset' instance of these distributions, which is required for 'concrete access information'.
Dataset Splits	No	The paper does not specify training, validation, and test dataset splits. The experiments involve simulations and repetitions on data generated from Zipfian distributions rather than fixed dataset splits for model training.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models or other computing specifications.
Software Dependencies	No	The paper mentions 'Python' and 'Count Sketch' as software components used in the experiments but does not provide specific version numbers for these, which is required for a reproducible description of ancillary software.
Experiment Setup	Yes	We simulated 2-pass and 1-pass WORp in Python using Count Sketch with 15 repetitions and table size 2k (total space 30k) as our rHH sketch.
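
The Count Sketch configuration quoted in the Experiment Setup row (15 repetitions, table size 2k, total space 30k) can be illustrated with a minimal sketch. This is not the authors' Colab code and does not implement WORp itself; it covers only a generic Count Sketch with those dimensions. The class name, the seeded-PRNG stand-in for hash functions, the sample size k = 10, the 1000 keys, and the Zipf exponent 2.0 are all assumptions made for illustration.

```python
import random
import statistics


class CountSketch:
    """Minimal Count Sketch: `rows` repetitions, each a table of `width` counters."""

    def __init__(self, rows, width, seed=0):
        self.rows = rows
        self.width = width
        self.seed = seed
        self.table = [[0.0] * width for _ in range(rows)]

    def _bucket_and_sign(self, row, key):
        # Assumption: a PRNG seeded per (seed, row, key) stands in for the
        # bucket and sign hash functions; it is deterministic for a given key.
        rng = random.Random(f"{self.seed}-{row}-{key}")
        return rng.randrange(self.width), rng.choice((-1, 1))

    def update(self, key, value):
        # Signed updates are allowed: value may be negative.
        for row in range(self.rows):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * value

    def estimate(self, key):
        # Median over repetitions of the signed counter the key maps to.
        readings = []
        for row in range(self.rows):
            bucket, sign = self._bucket_and_sign(row, key)
            readings.append(sign * self.table[row][bucket])
        return statistics.median(readings)


k = 10                                     # hypothetical sample size
sketch = CountSketch(rows=15, width=2 * k)  # 15 * 2k = 30k counters in total

# Hypothetical Zipf-like input: the key of rank i has frequency i ** -2.0.
for rank in range(1, 1001):
    sketch.update(rank, rank ** -2.0)

top_estimate = sketch.estimate(1)           # estimate for the heaviest key
```

Because the sketch is linear, an update of +v followed by -v on the same key returns every counter to its previous value, which is the property that allows signed updates.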