Strategyproof Mean Estimation from Multiple-Choice Questions
Authors: Anson Kahng, Gregory Kehne, Ariel Procaccia
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct experiments in the latter setting of known distributions, in which we aim to quantify the benefit of tailoring the estimator to the distributions. We focus on the MSE as our measure of error, because Theorem 3 shows that an optimal estimator with respect to MAE is hard to compute, and we show that the optimal estimator significantly outperforms a naïve estimator. In more detail, we compare the MSE-optimal prior-sensitive estimator of Theorem 2 to the deterministic worst-case optimal strategy described in Section 3, which does not incorporate knowledge of prior distributions. Figure 2 shows sample averages of MSE for both the uniform and optimal estimators applied to distributions from the Gaussian family, for a range of n with fixed k = 3, and for a range of k with fixed n = 50. The MSE for each pair of generated distribution P and estimator is measured as an average over 1000 draws p ∼ P. (A hedged simulation sketch of this comparison appears after the table.) |
| Researcher Affiliation | Academia | Computer Science Department, Carnegie Mellon University; Department of Mathematical Sciences, Carnegie Mellon University; School of Engineering and Applied Sciences, Harvard University. |
| Pseudocode | No | The paper describes algorithms (e.g., a dynamic program for one-dimensional k-means), but there are no figures or blocks labeled 'Pseudocode' or 'Algorithm'. (A hedged sketch of the standard 1-D k-means dynamic program appears after the table.) |
| Open Source Code | No | The paper does not mention releasing source code, providing a repository link, or making it available in supplementary materials. |
| Open Datasets | No | The paper describes generating synthetic data from specified distributions (uniform, Gaussian, bimodal) for experiments, rather than using a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper does not provide specific percentages or counts for training, validation, or test splits. It mentions 'drawing m samples' and '1000 draws p ∼ P' without explicit data partitioning details for reproduction. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed to replicate the experiments. |
| Experiment Setup | No | The paper describes how instances are generated and evaluated (e.g., '1000 draws p ∼ P'), but does not provide specific hyperparameter values, optimizer settings, or detailed training configurations for replication. |
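
To make the quoted experimental comparison concrete, here is a minimal simulation sketch in the spirit of the Research Type evidence above: draws p ∼ P are quantized through k answer choices, and the MSE of a naive equal-width ("uniform") estimator is compared against one tailored to the prior. The truncated-Gaussian prior, the choice of bucket representatives, and the use of plain Lloyd-style k-means as the prior-tailored stand-in are all assumptions made for illustration, not the paper's exact construction.

```python
# Hedged sketch: compare a uniform (equal-width) quantizer against one
# tailored to an assumed prior, measuring MSE over draws p ~ P.
import numpy as np

rng = np.random.default_rng(0)

def sample_prior(m):
    """Draw m samples from a Gaussian prior truncated to [0, 1] (assumed prior)."""
    s = rng.normal(0.5, 0.15, size=2 * m)
    s = s[(s >= 0) & (s <= 1)]
    return s[:m]

def quantize_mse(samples, reps):
    """MSE when each sample reports the nearest of the k representatives."""
    nearest = reps[np.argmin(np.abs(samples[:, None] - reps[None, :]), axis=1)]
    return np.mean((nearest - samples) ** 2)

def kmeans_1d(samples, k, iters=50):
    """Plain Lloyd's iterations in 1-D: a stand-in for the prior-tailored representatives."""
    reps = np.quantile(samples, (np.arange(k) + 0.5) / k)  # quantile init
    for _ in range(iters):
        assign = np.argmin(np.abs(samples[:, None] - reps[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                reps[j] = samples[assign == j].mean()
    return np.sort(reps)

k, m = 3, 1000  # k answer choices; 1000 evaluation draws, matching the quoted setup
train, test = sample_prior(10_000), sample_prior(m)

uniform_reps = (np.arange(k) + 0.5) / k   # midpoints of k equal-width intervals
tailored_reps = kmeans_1d(train, k)       # fit to the (assumed) prior

print(f"uniform MSE:  {quantize_mse(test, uniform_reps):.5f}")
print(f"tailored MSE: {quantize_mse(test, tailored_reps):.5f}")
```

Running this with a prior concentrated around 0.5 shows the tailored representatives achieving lower MSE than the equal-width ones, which is the qualitative effect the quoted passage reports.

Separately, since the Pseudocode row notes the paper describes a dynamic program for k-means without labeled pseudocode, the sketch below shows the standard exact O(k·n²) dynamic program for 1-D k-means. This is the textbook algorithm, not necessarily the paper's exact formulation.

```python
# Hedged sketch: exact 1-D k-means via dynamic programming (standard algorithm,
# assumed here as an illustration of the technique the paper references).
import numpy as np

def kmeans_1d_dp(points, k):
    """Optimal 1-D k-means SSE via dynamic programming over sorted points."""
    x = np.sort(np.asarray(points, dtype=float))
    n = len(x)
    # Prefix sums let us evaluate the SSE of any contiguous block in O(1):
    # sse(i, j) = sum(x^2) - (sum x)^2 / count over x[i..j].
    ps = np.concatenate(([0.0], np.cumsum(x)))
    ps2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def sse(i, j):  # inclusive indices into the sorted array
        s, s2, c = ps[j + 1] - ps[i], ps2[j + 1] - ps2[i], j - i + 1
        return s2 - s * s / c

    # D[t][j] = optimal SSE of splitting x[0..j] into t+1 clusters.
    D = np.full((k, n), np.inf)
    for j in range(n):
        D[0][j] = sse(0, j)
    for t in range(1, k):
        for j in range(t, n):
            D[t][j] = min(D[t - 1][i - 1] + sse(i, j) for i in range(t, j + 1))
    return D[k - 1][n - 1]

# Example: optimal 3-cluster SSE of a small sample (three tight pairs).
print(kmeans_1d_dp([0.1, 0.15, 0.5, 0.55, 0.9, 0.95], 3))  # ~0.00375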
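```

Unlike Lloyd's iterations, the dynamic program is exact in one dimension, which matters when the "optimal" estimator is the object being evaluated rather than a heuristic baseline.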