Surprisingly Popular Voting Recovers Rankings, Surprisingly!

Authors: Hadi Hosseini, Debmalya Mandal, Nisarg Shah, Kevin Shi

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally demonstrate that even a little prediction information helps surprisingly popular voting outperform classical approaches.
Researcher Affiliation | Academia | 1 Pennsylvania State University, 2 Columbia University, 3 University of Toronto
Pseudocode | No | The formal algorithm and its detailed description are provided in the full version. (Footnote 1: https://arxiv.org/abs/2105.09386) The pseudocode is not present in this excerpt of the paper.
Open Source Code | No | The paper does not contain an explicit statement or link to open-source code for the methodology presented. Footnote 1 links to the full version of the paper on arXiv, not a code repository.
Open Datasets | Yes | To generate questions with an underlying ground truth comparison of alternatives, we used three datasets from three distinct domains: 1. The geography dataset [2] contains 230 countries with their 2019 population estimates according to the United Nations. 2. The movies dataset [3] contains 15,743 movies with their lifetime box-office gross earnings. 3. The paintings dataset [4] contains 80 paintings with their latest auction prices. (Footnotes provide sources: [2] Retrieved from worldpopulationreview.com; [3] Retrieved from boxofficemojo.com/chart/top_lifetime_gross; [4] Generously provided by the authors of Prelec et al. [2017].)
Dataset Splits | Yes | To learn effective values of these parameters, we split the dataset into a training and a test set. For each elicitation format, we selected 5 questions from each of three domains, reserving the remaining 15 questions from each domain for the test set. Using these 15 [training] questions, we performed a grid search over α ranging from 0.55 to 0.95 in increments of 0.025 and β ranging from 0.05 to 0.45 in increments of 0.025, and selected the values with the lowest mean squared error.
Hardware Specification | No | The paper describes an empirical study conducted with human participants on Amazon Mechanical Turk, but it does not specify any hardware used by the authors for data processing, model training, or analysis.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., programming languages, libraries, frameworks, or solvers).
Experiment Setup | Yes | Next, we conduct an empirical study with 720 participants from Amazon's Mechanical Turk platform. ... For each of the 60 questions and each of the 6 elicitation formats described in Section 3, we elicited 20 responses, generating a total of 7,200 responses. ... To learn effective values of these parameters, we split the dataset into a training and a test set. ... we performed a grid search over α ranging from 0.55 to 0.95 in increments of 0.025 and β ranging from 0.05 to 0.45 in increments of 0.025 and selected the values with the lowest mean squared error.
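
As the Pseudocode row notes, the paper's formal algorithm appears only in the full arXiv version. For orientation, below is a minimal sketch of the standard surprisingly popular rule of Prelec et al. [2017] applied to a single binary comparison; the paper studies recovering full rankings, which this sketch does not attempt, and the function and variable names here are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of surprisingly popular (SP) voting on one binary
# question with alternatives 'A' and 'B'. Each respondent reports an
# answer plus a prediction of the fraction of others who will answer 'A'.
# The alternative whose actual vote share exceeds its mean predicted
# share is "surprisingly popular" and wins. Names are illustrative only.

def surprisingly_popular(votes, predictions):
    """Return the SP winner of a binary question.

    votes       -- list of reported answers, each 'A' or 'B'
    predictions -- list of predicted fractions (in [0, 1]) of respondents
                   answering 'A', one per respondent
    """
    actual_a = votes.count("A") / len(votes)
    predicted_a = sum(predictions) / len(predictions)
    # 'A' wins iff it is more popular than the crowd predicted.
    return "A" if actual_a > predicted_a else "B"


# Example: the majority answers 'B', but 'A' (40% actual vs. ~18%
# predicted) is more popular than expected, so SP overrules the majority.
votes = ["A", "A", "B", "B", "B"]
predictions = [0.1, 0.2, 0.1, 0.3, 0.2]
print(surprisingly_popular(votes, predictions))  # -> 'A'
```

The example exercises the defining property of the rule: it can overrule a simple majority whenever an alternative turns out to be more popular than the crowd predicted it would be.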
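The Dataset Splits and Experiment Setup rows describe a concrete tuning procedure: a grid search over α from 0.55 to 0.95 and β from 0.05 to 0.45, both in increments of 0.025, keeping the pair with the lowest mean squared error on the training questions. A minimal sketch of that loop follows, assuming a caller-supplied evaluate_mse; that function is a hypothetical stand-in for the paper's actual aggregation-and-scoring pipeline, which is not specified in this excerpt.

```python
import itertools

def frange(start, stop, step):
    """Inclusive float range, built on integer counts to avoid FP drift."""
    n = round((stop - start) / step)
    return [start + i * step for i in range(n + 1)]

def grid_search(train_questions, evaluate_mse):
    """Return the (alpha, beta) pair with the lowest MSE on the training set."""
    best_pair, best_mse = None, float("inf")
    for alpha, beta in itertools.product(
        frange(0.55, 0.95, 0.025),  # 17 alpha values
        frange(0.05, 0.45, 0.025),  # 17 beta values
    ):
        mse = evaluate_mse(train_questions, alpha, beta)
        if mse < best_mse:
            best_pair, best_mse = (alpha, beta), mse
    return best_pair

# Demonstration with a dummy objective (not the paper's loss): the search
# recovers the grid point nearest the dummy optimum (0.7, 0.2).
dummy_mse = lambda qs, a, b: (a - 0.7) ** 2 + (b - 0.2) ** 2
print(grid_search(range(15), dummy_mse))  # ~ (0.7, 0.2)
```

Exhaustive search over this 17 x 17 grid costs only 289 evaluations per elicitation format, which is why a simple double loop suffices here rather than any smarter optimizer.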