Worst-Case Analysis for Randomly Collected Data
Authors: Justin Chen, Gregory Valiant, Paul Valiant
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate the benefit of this framework and our algorithm in comparison to standard estimators, for several such settings. |
| Researcher Affiliation | Academia | Justin Y. Chen MIT justc@mit.edu Gregory Valiant Stanford University gvaliant@cs.stanford.edu Paul Valiant IAS and Purdue University pvaliant@gmail.com |
| Pseudocode | Yes | Algorithm 1 SDP Algorithm yielding π/2-approximation to the best semilinear estimator |
| Open Source Code | Yes | our code is available at https://github.com/justc2/worst-case-randomly-collected. |
| Open Datasets | No | The paper uses synthetic data generated from specified parameters (e.g., 'n = 50 elements where the ith element is included... with probability p_i', 'n = 50 points is drawn uniformly from the 2D unit square'). It does not reference or provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper describes how samples and target sets are drawn according to a known joint distribution P, but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or predefined split references for model training or evaluation. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments, such as specific GPU or CPU models. |
| Software Dependencies | Yes | our implementation of Algorithm 1 using the Python CVXPY package [9, 1] with the MOSEK solver [2]...MOSEK Optimizer API for Python 9.2.10, 2019. |
| Experiment Setup | Yes | We consider a set of n = 50 elements where the ith element is included in the sample set independently with probability p_i, with p_1, ..., p_25 = 0.1 and p_26, ..., p_50 = 0.5. The target set is the entire population, i.e. the goal is to estimate the population mean. |
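
The sampling scheme quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative simulation only, not the authors' code: the function names, the random seed, and the population values (drawn here uniformly from [-1, 1]) are assumptions made for the example, and the plain sample mean stands in for a standard baseline estimator rather than the paper's Algorithm 1.

```python
import random


def inclusion_probabilities(n=50, n_low=25, p_low=0.1, p_high=0.5):
    """Return p_1..p_n: the first n_low entries are p_low, the rest p_high."""
    return [p_low] * n_low + [p_high] * (n - n_low)


def draw_sample_set(probs, rng):
    """Include index i independently with probability probs[i]."""
    return [i for i, p in enumerate(probs) if rng.random() < p]


def sample_mean(values, sample):
    """Mean of the sampled values, or None if the sample set is empty."""
    if not sample:
        return None
    return sum(values[i] for i in sample) / len(sample)


rng = random.Random(0)  # seed chosen arbitrarily for repeatability
probs = inclusion_probabilities()

# Hypothetical population values; the source does not specify them here.
values = [rng.uniform(-1.0, 1.0) for _ in probs]

sample = draw_sample_set(probs, rng)
population_mean = sum(values) / len(values)  # target: mean of the whole population
estimate = sample_mean(values, sample)       # naive estimate from the sample set

print(f"sample size: {len(sample)}")
print(f"population mean: {population_mean:.4f}")
print(f"sample mean: {estimate if estimate is None else round(estimate, 4)}")
```

With these inclusion probabilities the expected sample size is 25 × 0.1 + 25 × 0.5 = 15, and elements 26 through 50 are five times as likely to be observed as elements 1 through 25, so the sample mean weights the two halves of the population unevenly.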