Strategyproof Mean Estimation from Multiple-Choice Questions
Authors: Anson Kahng, Gregory Kehne, Ariel Procaccia
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct experiments in the latter setting of known distributions, in which we aim to quantify the benefit of tailoring the estimator to the distributions. We focus on the MSE as our measure of error, because Theorem 3 shows that an optimal estimator with respect to MAE is hard to compute, and we show that the optimal estimator significantly outperforms a naïve estimator. In more detail, we compare the MSE-optimal prior-sensitive estimator of Theorem 2 to the deterministic worst-case optimal strategy described in Section 3, which does not incorporate knowledge of prior distributions. Figure 2 shows sample averages of MSE for both the uniform and optimal estimators applied to distributions from the Gaussian family, for a range of n with fixed k = 3, and for a range of k with fixed n = 50. The MSE for each pair of generated distribution P and estimator is measured as an average over 1000 draws p ∼ P. (A hedged simulation sketch of this comparison appears after the table.) |
| Researcher Affiliation | Academia | Computer Science Department, Carnegie Mellon University; Department of Mathematical Sciences, Carnegie Mellon University; School of Engineering and Applied Sciences, Harvard University. |
| Pseudocode | No | The paper describes algorithms (e.g., a dynamic program for one-dimensional k-means), but there are no figures or blocks labeled 'Pseudocode' or 'Algorithm'. (A hedged sketch of the standard 1-D k-means dynamic program appears after the table.) |
| Open Source Code | No | The paper does not mention releasing source code, providing a repository link, or making it available in supplementary materials. |
| Open Datasets | No | The paper describes generating synthetic data from specified distributions (uniform, Gaussian, bimodal) for experiments, rather than using a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper does not provide specific percentages or counts for training, validation, or test splits. It mentions 'drawing m samples' and '1000 draws p ∼ P' without explicit data partitioning details for reproduction. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed to replicate the experiments. |
| Experiment Setup | No | The paper describes how instances are generated and evaluated (e.g., '1000 draws p ∼ P'), but does not provide specific hyperparameter values, optimizer settings, or detailed training configurations for replication. |
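
To make the quoted experimental comparison concrete, here is a minimal simulation sketch in the spirit of the Research Type evidence above: draws p ∼ P are quantized through k answer choices, and the MSE of a naive equal-width ("uniform") estimator is compared against one tailored to the prior. The truncated-Gaussian prior, the choice of bucket representatives, and the use of plain Lloyd-style k-means as the prior-tailored stand-in are all assumptions made for illustration, not the paper's exact construction.

```python
# Hedged sketch: compare a uniform (equal-width) quantizer against one
# tailored to an assumed prior, measuring MSE over draws p ~ P.
import numpy as np

rng = np.random.default_rng(0)

def sample_prior(m):
    """Draw m samples from a Gaussian prior truncated to [0, 1] (assumed prior)."""
    s = rng.normal(0.5, 0.15, size=2 * m)
    s = s[(s >= 0) & (s <= 1)]
    return s[:m]

def quantize_mse(samples, reps):
    """MSE when each sample reports the nearest of the k representatives."""
    nearest = reps[np.argmin(np.abs(samples[:, None] - reps[None, :]), axis=1)]
    return np.mean((nearest - samples) ** 2)

def kmeans_1d(samples, k, iters=50):
    """Plain Lloyd's iterations in 1-D: a stand-in for the prior-tailored representatives."""
    reps = np.quantile(samples, (np.arange(k) + 0.5) / k)  # quantile init
    for _ in range(iters):
        assign = np.argmin(np.abs(samples[:, None] - reps[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                reps[j] = samples[assign == j].mean()
    return np.sort(reps)

k, m = 3, 1000  # k answer choices; 1000 evaluation draws, matching the quoted setup
train, test = sample_prior(10_000), sample_prior(m)

uniform_reps = (np.arange(k) + 0.5) / k   # midpoints of k equal-width intervals
tailored_reps = kmeans_1d(train, k)       # fit to the (assumed) prior

print(f"uniform MSE:  {quantize_mse(test, uniform_reps):.5f}")
print(f"tailored MSE: {quantize_mse(test, tailored_reps):.5f}")
```

Running this with a prior concentrated around 0.5 shows the tailored representatives achieving lower MSE than the equal-width ones, which is the qualitative effect the quoted passage reports.

Separately, since the Pseudocode row notes the paper describes a dynamic program for k-means without labeled pseudocode, the sketch below shows the standard exact O(k·n²) dynamic program for 1-D k-means. This is the textbook algorithm, not necessarily the paper's exact formulation.

```python
# Hedged sketch: exact 1-D k-means via dynamic programming (standard algorithm,
# assumed here as an illustration of the technique the paper references).
import numpy as np

def kmeans_1d_dp(points, k):
    """Optimal 1-D k-means SSE via dynamic programming over sorted points."""
    x = np.sort(np.asarray(points, dtype=float))
    n = len(x)
    # Prefix sums let us evaluate the SSE of any contiguous block in O(1):
    # sse(i, j) = sum(x^2) - (sum x)^2 / count over x[i..j].
    ps = np.concatenate(([0.0], np.cumsum(x)))
    ps2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def sse(i, j):  # inclusive indices into the sorted array
        s, s2, c = ps[j + 1] - ps[i], ps2[j + 1] - ps2[i], j - i + 1
        return s2 - s * s / c

    # D[t][j] = optimal SSE of splitting x[0..j] into t+1 clusters.
    D = np.full((k, n), np.inf)
    for j in range(n):
        D[0][j] = sse(0, j)
    for t in range(1, k):
        for j in range(t, n):
            D[t][j] = min(D[t - 1][i - 1] + sse(i, j) for i in range(t, j + 1))
    return D[k - 1][n - 1]

# Example: optimal 3-cluster SSE of a small sample (three tight pairs).
print(kmeans_1d_dp([0.1, 0.15, 0.5, 0.55, 0.9, 0.95], 3))  # ~0.00375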
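```

Unlike Lloyd's iterations, the dynamic program is exact in one dimension, which matters when the "optimal" estimator is the object being evaluated rather than a heuristic baseline.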