Unified Sample-Optimal Property Estimation in Near-Linear Time
Authors: Yi Hao, Alon Orlitsky
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and time-efficient estimators for all sufficiently smooth, symmetric and non-symmetric, additive properties. This technique yields near-linear-time computable estimators whose approximation values are asymptotically optimal and highly concentrated, resulting in the first: 1) estimators achieving the O(k/(ε² log k)) min-max ε-error sample complexity for all k-symbol Lipschitz properties; 2) unified near-optimal differentially private estimators for a variety of properties; 3) unified estimator achieving optimal bias and near-optimal variance for five important properties; 4) near-optimal sample-complexity estimators for several important symmetric properties over both domain sizes and confidence levels. |
| Researcher Affiliation | Academia | Yi Hao Dept. of Electrical and Computer Engineering University of California, San Diego yih179@ucsd.edu Alon Orlitsky Dept. of Electrical and Computer Engineering University of California, San Diego alon@ucsd.edu |
| Pseudocode | No | The paper describes the methodology conceptually and mathematically but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the described methodology. |
| Open Datasets | No | This is a theoretical paper presenting a methodology and its properties, not empirical work involving training on datasets. Therefore, there is no mention of publicly available datasets for training. |
| Dataset Splits | No | This is a theoretical paper presenting a methodology and its properties, not empirical work involving data splits for validation. Therefore, there is no mention of training/validation/test splits. |
| Hardware Specification | No | This is a theoretical paper. No specific hardware used for experiments is mentioned. |
| Software Dependencies | No | The paper mentions the 'Remez algorithm' as a computational tool for approximating polynomials, but does not list specific software dependencies with version numbers. |
| Experiment Setup | No | This is a theoretical paper describing an estimation methodology. It does not provide details of an experimental setup such as hyperparameters or training configurations. |
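The abstract above describes estimating additive distribution properties via polynomial approximation, and the paper mentions the Remez (minimax) algorithm as a computational tool. As an illustrative sketch only, and not the paper's actual estimator, the following code approximates the entropy summand f(x) = -x log x with a Chebyshev fit (a standard near-minimax surrogate for a true Remez fit) and then sums the polynomial over empirical probabilities. All function names and parameters here (degree, fitting grid) are hypothetical choices for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def entropy_summand(x):
    """Entropy summand f(x) = -x * log(x), with f(0) = 0 by convention."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, -x * np.log(np.where(x > 0, x, 1.0)), 0.0)

# Fit a degree-8 Chebyshev polynomial to f on [1e-6, 1] -- a cheap stand-in
# for the minimax polynomial a Remez iteration would produce.
xs = np.linspace(1e-6, 1.0, 2001)
coefs = C.chebfit(xs, entropy_summand(xs), deg=8)

def plug_in_entropy(counts):
    """Naive plug-in estimate: sum f(p_hat) over observed symbols."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return float(np.sum(entropy_summand(p)))

def poly_entropy(counts):
    """Polynomial-surrogate estimate: sum P(p_hat), with P close to f."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return float(np.sum(C.chebval(p, coefs)))

counts = [40, 30, 20, 10]
print(plug_in_entropy(counts))  # entropy (nats) of (0.4, 0.3, 0.2, 0.1)
print(poly_entropy(counts))     # should be close to the plug-in value
```

The paper's contribution is precisely that the right (bias-corrected, piecewise) polynomial estimator beats the plug-in in sample complexity; this sketch only illustrates the polynomial-substitution idea.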