Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Convergence Guarantees for the Good-Turing Estimator
Authors: Amichai Painsky
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An extensive empirical study which demonstrates the performance of the proposed estimator, compared to currently known schemes. Finally, in Section 8 we compare our suggested framework with currently known estimators in a series of synthetic and real-world experiments. |
| Researcher Affiliation | Academia | Amichai Painsky, Department of Industrial Engineering, Tel Aviv University, Tel Aviv, Israel |
| Pseudocode | No | The paper focuses on mathematical derivations, theorems, and proofs related to the Good-Turing estimator. It does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to any code repositories. |
| Open Datasets | Yes | We begin with a corpus linguistic experiment. The popular Broadway play Hamilton consists of 20,520 words, of which m = 3,578 are distinct. Gao et al. (2007) considered the forearm skin biota of six subjects. Finally, we study census data. The lower row of Figure 5 considers the 2000 United States Census (Bureau, 2014), which lists the frequency of the top m = 1000 most common last names in the United States. |
| Dataset Splits | No | In each experiment we draw n samples, and compare the occupancy probabilities Mk(Xn) with their corresponding estimators, for different values of k. To attain an averaged error, we repeat each experiment 1000 times, and average the squared error. The paper describes a sampling and resampling evaluation methodology rather than traditional dataset splits for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models or other computer specifications. |
| Software Dependencies | No | The paper does not mention any specific software or library names along with their version numbers that would be necessary to replicate the experiments. |
| Experiment Setup | No | The paper describes the mathematical formulations of the estimators and analyzes their convergence rates. While it discusses sample sizes (n) and k values for evaluation, it does not specify hyperparameters, training configurations, or system-level settings typically found in experimental setups for machine learning models. |
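The evaluation protocol quoted in the Dataset Splits row (draw n samples, compare the occupancy probabilities Mk(Xn) with their estimators, repeat 1000 times, average the squared error) can be sketched as follows. This is a minimal illustration, not the paper's code: the Zipf-like source distribution, the sample size, and the helper names (`good_turing_mass`, `true_mass`, `averaged_squared_error`) are assumptions chosen for the example, and the estimator shown is the classical Good-Turing formula M_k ≈ (k+1)·N_{k+1}/n.

```python
import random
from collections import Counter

def good_turing_mass(counts, n, k):
    """Classical Good-Turing estimate of M_k, the total probability mass
    of symbols appearing exactly k times in a sample of size n:
    M_k ~ (k+1) * N_{k+1} / n, where N_j counts the distinct symbols
    seen exactly j times."""
    freq_of_freqs = Counter(counts.values())
    return (k + 1) * freq_of_freqs.get(k + 1, 0) / n

def true_mass(probs, counts, k):
    """True occupancy probability M_k(X^n): summed probability of the
    symbols that appear exactly k times in the sample (k = 0 gives the
    unseen, or "missing", mass)."""
    return sum(p for sym, p in probs.items() if counts.get(sym, 0) == k)

def averaged_squared_error(probs, n, k, trials=1000, seed=0):
    """Repeat the experiment `trials` times and average the squared
    error between the Good-Turing estimate and the true M_k, mirroring
    the 1000-repetition averaging described in the paper."""
    rng = random.Random(seed)
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    err = 0.0
    for _ in range(trials):
        sample = rng.choices(symbols, weights=weights, k=n)
        counts = Counter(sample)
        err += (good_turing_mass(counts, n, k) - true_mass(probs, counts, k)) ** 2
    return err / trials

# Illustrative source: a small Zipf-like distribution over 50 symbols
# (an assumption for this sketch, not one of the paper's datasets).
m = 50
z = sum(1 / i for i in range(1, m + 1))
probs = {i: (1 / i) / z for i in range(1, m + 1)}
mse_unseen = averaged_squared_error(probs, n=200, k=0)
```

The k = 0 case estimates the missing mass, i.e. the total probability of symbols never observed in the sample; larger k values evaluate the estimator on increasingly frequent symbols, as in the paper's comparison across different values of k.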