The Broad Optimality of Profile Maximum Likelihood
Authors: Yi Hao, Alon Orlitsky
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4, Experiments: distribution estimation under ℓ1 distance. In Figure 2, samples are generated according to six distributions of the same support size k = 5,000. Details about these distributions can be found in Section 4.2 of the supplementary material. The sample size n (horizontal axis) ranges from 10,000 to 100,000, and the vertical axis reflects the (unsorted) ℓ1 distance between the true distribution and the estimates, averaged over 30 independent trials. We compare our estimator with three others: the improved Good-Turing estimator of [56, 41], which is provably instance-by-instance near-optimal [56]; the empirical estimator, serving as a baseline; and the empirical estimator with a larger sample size of n log n. |
| Researcher Affiliation | Academia | Yi Hao Dept. of Electrical and Computer Engineering University of California, San Diego yih179@ucsd.edu Alon Orlitsky Dept. of Electrical and Computer Engineering University of California, San Diego alon@ucsd.edu |
| Pseudocode | Yes | Figure 1: Uniformity tester T_PML. Input: parameters k, ε, and a sample X^n ∼ p with profile ϕ. if max_x µ_x(X^n) ≥ 3·max{1, n/k}·log k then return 1; elif ‖p_ϕ − p_u‖₂ ≥ 3ε/(4√k) then return 1; else return 0. |
| Open Source Code | No | The paper does not provide a concrete statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper mentions 'samples are generated according to six distributions' and 'Details about these distributions can be found in Section 4.2 of the supplementary material', but does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper discusses sampling for estimation and testing but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper describes high-level experimental parameters like sample size ranges but does not provide specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings for any models or algorithms. |
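The distribution-estimation experiment described in the Research Type row (samples from a fixed distribution, empirical plug-in estimate, unsorted ℓ1 loss averaged over 30 trials) can be sketched as follows. This is a minimal illustration, not the paper's code: the true distribution here is uniform over k symbols, whereas the paper uses six specific distributions with k = 5,000 detailed in its supplementary Section 4.2, and the paper's PML and Good-Turing estimators are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_estimate(sample, k):
    """Empirical (plug-in) distribution estimate over support {0, ..., k-1}."""
    counts = np.bincount(sample, minlength=k)
    return counts / counts.sum()

def l1_distance(p, q):
    """Unsorted l1 distance between two distributions on the same support."""
    return np.abs(p - q).sum()

# Hypothetical setup: uniform true distribution; the paper instead uses six
# distributions of support size k = 5,000 and n from 10,000 to 100,000.
k, n, trials = 5_000, 10_000, 30
p_true = np.full(k, 1.0 / k)

losses = []
for _ in range(trials):
    sample = rng.choice(k, size=n, p=p_true)
    losses.append(l1_distance(p_true, empirical_estimate(sample, k)))

print(f"mean l1 loss over {trials} trials: {np.mean(losses):.3f}")
```

Averaging over independent trials, as the paper does, smooths the sampling noise in the loss curve; sweeping n over the paper's range would reproduce the horizontal axis of its Figure 2.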
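The uniformity tester quoted in the Pseudocode row can be sketched in Python under two stated assumptions: the comparison directions (≥, with return value 1 meaning "reject uniformity") are as reconstructed above, and the PML estimate p_ϕ is supplied precomputed, since computing the exact PML distribution is itself a nontrivial optimization not shown here.

```python
import math

import numpy as np

def pml_uniformity_tester(sample, p_phi, k, eps):
    """Sketch of the Figure 1 tester: returns 1 (reject uniformity) or 0.

    `p_phi` is an assumed precomputed PML distribution estimate over
    {0, ..., k-1}; `sample` is an integer array of n draws.
    """
    n = len(sample)
    # First test: reject if some symbol's multiplicity is implausibly
    # large for a uniform source, per the 3*max{1, n/k}*log k threshold.
    max_mult = np.bincount(sample, minlength=k).max()
    if max_mult >= 3 * max(1, n / k) * math.log(k):
        return 1
    # Second test: reject if the PML estimate is far from uniform in l2.
    p_uniform = np.full(k, 1.0 / k)
    if np.linalg.norm(p_phi - p_uniform) >= 3 * eps / (4 * math.sqrt(k)):
        return 1
    return 0
```

The two-stage structure mirrors the pseudocode: a cheap multiplicity check screens out grossly non-uniform samples before the ℓ2 comparison against the uniform distribution p_u is applied.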