reproducibilityindex.ai

Competitive Distribution Estimation: Why is Good-Turing Good

Authors: Alon Orlitsky, Ananda Theertha Suresh

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	'Figure 2: Simulation results for support 10000, number of samples ranging from 1000 to 50000, averaged over 200 trials.' and 'We compare the performance of this estimator to four estimators'
Researcher Affiliation	Academia	Alon Orlitsky, UC San Diego, alon@ucsd.edu; Ananda Theertha Suresh, UC San Diego, asuresh@ucsd.edu
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks. The methods are described mathematically and textually.
Open Source Code	No	No concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper was found.
Open Datasets	No	The paper describes generating data from various distributions (e.g., 'Uniform', 'Zipf', 'Dirichlet prior') for simulations, but does not refer to or provide access information for a publicly available or open dataset.
Dataset Splits	No	The paper describes simulation parameters like 'number of samples ranging from 1000 to 50000, averaged over 200 trials', but does not specify dataset splits (training, validation, test) or cross-validation setup for reproducibility.
Hardware Specification	No	No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments were mentioned in the paper.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	For symbols appearing t times, if ϕ_{t+1} ≥ Ω(t), then the Good-Turing estimate is close to the underlying total probability mass, otherwise the empirical estimate is closer. Hence, for a symbol appearing t times, if ϕ_{t+1} ≥ t, we use the Good-Turing estimator, otherwise we use the empirical estimator. If n_x = t, q_x(x^n) = t / N if t > ϕ_{t+1}, and ((ϕ_{t+1} + 1) / ϕ_t) · ((t + 1) / N) otherwise ... All distributions have support size k = 10000. n ranges from 1000 to 50000 and the results are averaged over 200 trials.
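
Since the paper provides no pseudocode or source code, the estimator quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration and not the authors' implementation. It reads the Good-Turing branch as the standard Good-Turing formula with ϕ_{t+1} replaced by ϕ_{t+1} + 1, which is an assumption about the garbled excerpt. It also assumes the support is known in advance, so that unseen symbols (t = 0) are counted in ϕ_0. The names hybrid_good_turing and kl_divergence are hypothetical, and KL divergence is used as the loss only as an assumption, because the excerpt does not name one.

```python
import math
import random
from collections import Counter


def hybrid_good_turing(samples, support):
    """Estimate a distribution over `support` from `samples`.

    A symbol seen t times gets the empirical weight t when t > phi[t+1],
    and the Good-Turing-style weight (phi[t+1] + 1) / phi[t] * (t + 1)
    otherwise, where phi[t] is the number of symbols seen exactly t times.
    The weights are then divided by their sum N (the normalization factor).
    """
    counts = Counter(samples)
    # n_x for every symbol in the support; unseen symbols get n_x = 0.
    n = {x: counts.get(x, 0) for x in support}
    # phi[t] = number of symbols appearing exactly t times (0 if none).
    phi = Counter(n.values())

    weights = {}
    for x, t in n.items():
        if t > phi[t + 1]:
            weights[x] = float(t)  # empirical branch
        else:
            # phi[t] >= 1 here because symbol x itself appears t times.
            weights[x] = (phi[t + 1] + 1) / phi[t] * (t + 1)

    total = sum(weights.values())  # normalization factor N
    return {x: w / total for x, w in weights.items()}


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


if __name__ == "__main__":
    # Toy run on a uniform distribution. The sizes are far smaller than
    # the k = 10000 and n = 1000..50000 reported in the table.
    rng = random.Random(0)
    k, n_samples = 1000, 500
    support = range(k)
    p = {x: 1.0 / k for x in support}
    samples = [rng.randrange(k) for _ in range(n_samples)]
    q = hybrid_good_turing(samples, support)
    print("KL(p || q) =", kl_divergence(p, q))
```

To approximate the reported setup, one would repeat the toy run for k = 10000 and each n from 1000 to 50000, average the loss over 200 trials, and swap in the other distributions named in the table (Zipf, Dirichlet prior). Their exact parameters are not given in the excerpt above.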