Differentially Private n-gram Extraction

Authors: Kunho Kim, Sivakanth Gopi, Janardhan Kulkarni, Sergey Yekhanin

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically evaluate the performance of our algorithms on two datasets: Reddit and MSNBC.
Researcher Affiliation | Industry | Kunho Kim (Microsoft, kuki@microsoft.com); Sivakanth Gopi (Microsoft Research, sigopi@microsoft.com); Janardhan Kulkarni (Microsoft Research, jakul@microsoft.com); Sergey Yekhanin (Microsoft Research, yekhanin@microsoft.com)
Pseudocode | Yes | In this section we describe our algorithm for DPNE. The pseudocode is presented in Algorithm 1.
Open Source Code | Yes | Code available at https://github.com/microsoft/differentially-private-ngram-extraction
Open Datasets | Yes | The Reddit dataset is a natural-language dataset used extensively in NLP applications, and is taken from the TensorFlow repository. The MSNBC dataset consists of page visits of users who browsed msnbc.com on September 28, 1999, recorded at the URL level and ordered by time.
Dataset Splits | No | The paper uses the Reddit and MSNBC datasets but does not specify how they were split into training, validation, or test sets. No explicit percentages, counts, or references to standard splits are provided for reproducibility.
Hardware Specification | No | The paper does not specify hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers, such as the programming languages, libraries, or frameworks used for implementation or experimentation.
Experiment Setup | Yes | Throughout this section we fix T = 9, ε = 4, δ = 10⁻⁷, Δ₁ = ⋯ = Δ₉ = Δ₀ = 300, η = 0.01 unless otherwise specified.
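
The experiment-setup row mentions per-order contribution limits (Δ) alongside the privacy parameters ε and δ. As a rough illustration of the kind of pipeline DPNE describes, the sketch below extracts n-grams by capping each user's contribution, adding Gaussian noise to the aggregated counts, and releasing only n-grams whose noisy count clears a threshold. This is a minimal, hypothetical sketch for intuition: the function name, the specific `sigma`/`threshold` values, and the per-user capping rule are assumptions, not the paper's Algorithm 1, and the noise/threshold calibration needed for a real (ε, δ) guarantee is omitted.

```python
import random
from collections import Counter

def dp_extract_ngrams(user_docs, n=2, delta_max=300, sigma=10.0,
                      threshold=40.0, seed=0):
    """Hypothetical DP-style n-gram extraction via noisy count thresholding.

    Each user contributes at most `delta_max` distinct n-grams (bounding the
    sensitivity of the count vector); Gaussian noise is added to each count,
    and only n-grams whose noisy count exceeds `threshold` are released.
    """
    rng = random.Random(seed)
    counts = Counter()
    for tokens in user_docs:
        # Distinct n-grams per user, capped to bound one user's influence.
        ngrams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for gram in sorted(ngrams)[:delta_max]:
            counts[gram] += 1
    released = {}
    for gram, count in counts.items():
        noisy = count + rng.gauss(0.0, sigma)
        if noisy > threshold:
            released[gram] = noisy
    return released
```

In the paper's setting this kind of step would be applied iteratively for n = 1, …, T with the per-order budgets Δ₁, …, Δ₉ quoted above; the sketch shows only a single order n.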