The Broad Optimality of Profile Maximum Likelihood
Authors: Yi Hao, Alon Orlitsky
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4, Experiments: distribution estimation under ℓ1 distance. In Figure 2, samples are generated according to six distributions of the same support size k = 5,000. Details about these distributions can be found in Section 4.2 of the supplementary material. The sample size n (horizontal axis) ranges from 10,000 to 100,000, and the vertical axis reflects the (unsorted) ℓ1 distance between the true distribution and the estimates, averaged over 30 independent trials. We compare our estimator with three others: the improved Good-Turing estimator of [56, 41], which is provably instance-by-instance near-optimal [56]; the empirical estimator, serving as a baseline; and the empirical estimator with a larger sample size of n log n. |
| Researcher Affiliation | Academia | Yi Hao Dept. of Electrical and Computer Engineering University of California, San Diego yih179@ucsd.edu Alon Orlitsky Dept. of Electrical and Computer Engineering University of California, San Diego alon@ucsd.edu |
| Pseudocode | Yes | Figure 1: Uniformity tester T_PML. Input: parameters k, ε, and a sample X^n ∼ p with profile ϕ. if max_x µ_x(X^n) ≥ 3·max{1, n/k}·log k then return 1; elif ‖p_ϕ − p_u‖₂ ≥ 3ε/(4√k) then return 1; else return 0. |
| Open Source Code | No | The paper does not provide a concrete statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper mentions 'samples are generated according to six distributions' and 'Details about these distributions can be found in Section 4.2 of the supplementary material', but does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper discusses sampling for estimation and testing but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper describes high-level experimental parameters like sample size ranges but does not provide specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings for any models or algorithms. |
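The distribution-estimation experiment described in the Research Type row (samples from a fixed distribution, empirical plug-in estimate, unsorted ℓ1 loss averaged over 30 trials) can be sketched as follows. This is a minimal illustration, not the paper's code: the true distribution here is uniform over k symbols, whereas the paper uses six specific distributions with k = 5,000 detailed in its supplementary Section 4.2, and the paper's PML and Good-Turing estimators are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_estimate(sample, k):
    """Empirical (plug-in) distribution estimate over support {0, ..., k-1}."""
    counts = np.bincount(sample, minlength=k)
    return counts / counts.sum()

def l1_distance(p, q):
    """Unsorted l1 distance between two distributions on the same support."""
    return np.abs(p - q).sum()

# Hypothetical setup: uniform true distribution; the paper instead uses six
# distributions of support size k = 5,000 and n from 10,000 to 100,000.
k, n, trials = 5_000, 10_000, 30
p_true = np.full(k, 1.0 / k)

losses = []
for _ in range(trials):
    sample = rng.choice(k, size=n, p=p_true)
    losses.append(l1_distance(p_true, empirical_estimate(sample, k)))

print(f"mean l1 loss over {trials} trials: {np.mean(losses):.3f}")
```

Averaging over independent trials, as the paper does, smooths the sampling noise in the loss curve; sweeping n over the paper's range would reproduce the horizontal axis of its Figure 2.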
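The uniformity tester quoted in the Pseudocode row can be sketched in Python under two stated assumptions: the comparison directions (≥, with return value 1 meaning "reject uniformity") are as reconstructed above, and the PML estimate p_ϕ is supplied precomputed, since computing the exact PML distribution is itself a nontrivial optimization not shown here.

```python
import math

import numpy as np

def pml_uniformity_tester(sample, p_phi, k, eps):
    """Sketch of the Figure 1 tester: returns 1 (reject uniformity) or 0.

    `p_phi` is an assumed precomputed PML distribution estimate over
    {0, ..., k-1}; `sample` is an integer array of n draws.
    """
    n = len(sample)
    # First test: reject if some symbol's multiplicity is implausibly
    # large for a uniform source, per the 3*max{1, n/k}*log k threshold.
    max_mult = np.bincount(sample, minlength=k).max()
    if max_mult >= 3 * max(1, n / k) * math.log(k):
        return 1
    # Second test: reject if the PML estimate is far from uniform in l2.
    p_uniform = np.full(k, 1.0 / k)
    if np.linalg.norm(p_phi - p_uniform) >= 3 * eps / (4 * math.sqrt(k)):
        return 1
    return 0
```

The two-stage structure mirrors the pseudocode: a cheap multiplicity check screens out grossly non-uniform samples before the ℓ2 comparison against the uniform distribution p_u is applied.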