On the Efficient Implementation of High Accuracy Optimality of Profile Maximum Likelihood

Authors: Moses Charikar, Zhihao Jiang, Kirankumar Shiragur, Aaron Sidford

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide an efficient unified plug-in approach for estimating symmetric properties of distributions given n independent samples. Our estimator is based on profile-maximum-likelihood (PML) and is sample optimal for estimating various symmetric properties when the estimation error ϵ ≫ n^{-1/3}. This result improves upon the previous best accuracy threshold of ϵ ≫ n^{-1/4} achievable by polynomial time computable PML-based universal estimators [ACSS21, ACSS20]. Our estimator reaches a theoretical limit for universal symmetric property estimation as [Han21] shows that a broad class of universal estimators (containing many well known approaches including ours) cannot be sample optimal for every 1-Lipschitz property when ϵ ≪ n^{-1/3}. ... In Section 2, we state our main results and also cover related work. In Section 3, we provide the convex relaxation to PML studied in [CSS19a, ACSS21]. Finally, in Section 4, we provide a proof sketch of our main computational result. Many proofs are then deferred to the appendix. (A toy sketch of this plug-in pipeline appears after the table.)
Researcher Affiliation | Academia | Moses Charikar (Stanford University, moses@cs.stanford.edu); Zhihao Jiang (Stanford University, faebdc@stanford.edu); Kirankumar Shiragur (Stanford University, shiragur@stanford.edu); Aaron Sidford (Stanford University, sidford@stanford.edu)
Pseudocode | Yes | Algorithm 1 Approximate PML(φ, R = R_{n,α}) ... Algorithm 2 swapmatrixround(A)
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository.
Open Datasets | No | The paper is theoretical and focuses on algorithm design and analysis. It does not involve empirical training on specific datasets; rather, it deals with theoretical properties of estimators given 'n independent samples y1, ..., yn ∈ D from an unknown discrete distribution p ∈ ∆_D'.
Dataset Splits | No | The paper is theoretical and focuses on algorithm design and analysis. It does not describe an experimental setup with dataset splits for validation.
Hardware Specification | No | The paper focuses on theoretical algorithms and their time complexity. It does not describe any specific hardware (e.g., GPU models, CPU types) used to run experiments.
Software Dependencies | No | The paper mentions a "CVX solver" in the limitations section, stating that it "is not practical for large sample instances", which suggests it was not used for the reported theoretical results; no version number is given, and no other specific software dependencies with version numbers are provided.
Experiment Setup | No | The paper is theoretical and describes algorithms and their theoretical guarantees. It does not contain details about an experimental setup, such as specific hyperparameter values, optimization settings, or training configurations.
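For context, the PML plug-in pipeline summarized in the abstract row above (reduce the sample to its profile, approximate the profile-maximum-likelihood distribution, then evaluate the symmetric property on that distribution) can be illustrated at toy scale. The sketch below is not the paper's Algorithm 1 or its convex-relaxation machinery: the brute-force grid search over a fixed 3-symbol support, the helper names profile_likelihood and approximate_pml, and the choice of Shannon entropy as the target property are all illustrative assumptions made here for exposition.

```python
# Toy, hedged sketch of the PML plug-in idea (NOT the paper's efficient
# Approximate PML algorithm): (1) reduce the sample to its profile, the
# multiset of symbol multiplicities; (2) pick a distribution that roughly
# maximizes the probability of that profile; (3) plug the resulting
# distribution into a symmetric property, here Shannon entropy.
from collections import Counter
from itertools import permutations
from math import log


def profile(samples):
    """Profile = sorted multiset of multiplicities, e.g. ['a','a','b','c'] -> [2, 1, 1]."""
    return sorted(Counter(samples).values(), reverse=True)


def profile_likelihood(p, phi):
    """Quantity proportional (up to a combinatorial factor that depends only
    on phi) to the probability that distribution p generates profile phi.
    Sums over assignments of support elements to multiplicity slots, so it
    assumes len(p) >= len(phi); exponential time, toy scale only."""
    total = 0.0
    for assignment in permutations(range(len(p)), len(phi)):
        term = 1.0
        for symbol, multiplicity in zip(assignment, phi):
            term *= p[symbol] ** multiplicity
        total += term
    return total


def approximate_pml(phi, grid=20):
    """Brute-force search over a coarse 3-point simplex grid -- a crude
    placeholder for the paper's convex-relaxation-based algorithm."""
    best_p, best_val = None, -1.0
    for a in range(1, grid):
        for b in range(1, grid - a):
            c = grid - a - b
            p = [a / grid, b / grid, c / grid]
            val = profile_likelihood(p, phi)
            if val > best_val:
                best_p, best_val = p, val
    return best_p


def shannon_entropy(p):
    """Symmetric property evaluated on the (approximate) PML distribution."""
    return -sum(q * log(q) for q in p if q > 0)


if __name__ == "__main__":
    samples = ["a", "a", "b", "b", "c"]
    phi = profile(samples)            # [2, 2, 1]
    p_hat = approximate_pml(phi)      # crude stand-in for a PML distribution
    print(phi, p_hat, shannon_entropy(p_hat))
```

The point of the sketch is only the shape of the pipeline: any symmetric property (entropy, support size, distance to uniformity) is computed from the same approximate PML distribution, which is what makes the estimator "unified"; the paper's contribution is computing such an approximation efficiently and with the ϵ ≫ n^{-1/3} accuracy guarantee, which this brute-force placeholder does not attempt.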