Surprisingly Popular Voting Recovers Rankings, Surprisingly!

Authors: Hadi Hosseini, Debmalya Mandal, Nisarg Shah, Kevin Shi

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally demonstrate that even a little prediction information helps surprisingly popular voting outperform classical approaches.
Researcher Affiliation | Academia | 1 Pennsylvania State University, 2 Columbia University, 3 University of Toronto
Pseudocode | No | The formal algorithm and its detailed description are provided in the full version. (Footnote 1: https://arxiv.org/abs/2105.09386) The pseudocode is not present in this excerpt of the paper.
Open Source Code | No | The paper does not contain an explicit statement or link to open-source code for the methodology presented. Footnote 1 links to the full version of the paper on arXiv, not a code repository.
Open Datasets | Yes | To generate questions with an underlying ground truth comparison of alternatives, we used three datasets from three distinct domains: 1. The geography dataset [2] contains 230 countries with their 2019 population estimates according to the United Nations. 2. The movies dataset [3] contains 15,743 movies with their lifetime box-office gross earnings. 3. The paintings dataset [4] contains 80 paintings with their latest auction prices. (Footnotes provide sources: [2] Retrieved from worldpopulationreview.com; [3] Retrieved from boxofficemojo.com/chart/top_lifetime_gross; [4] Generously provided by the authors of Prelec et al. [2017].)
Dataset Splits | Yes | To learn effective values of these parameters, we split the dataset into a training and a test set. For each elicitation format, we selected 5 questions from each of three domains, reserving the remaining 15 questions from each domain for the test set. Using these 15 [training] questions, we performed a grid search over α ranging from 0.55 to 0.95 in increments of 0.025 and β ranging from 0.05 to 0.45 in increments of 0.025, and selected the values with the lowest mean squared error.
Hardware Specification | No | The paper describes an empirical study conducted with human participants on Amazon Mechanical Turk, but it does not specify any hardware used by the authors for data processing, model training, or analysis.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., programming languages, libraries, frameworks, or solvers).
Experiment Setup | Yes | Next, we conduct an empirical study with 720 participants from Amazon's Mechanical Turk platform. ... For each of the 60 questions and each of the 6 elicitation formats described in Section 3, we elicited 20 responses, generating a total of 7,200 responses. ... To learn effective values of these parameters, we split the dataset into a training and a test set. ... we performed a grid search over α ranging from 0.55 to 0.95 in increments of 0.025 and β ranging from 0.05 to 0.45 in increments of 0.025 and selected the values with the lowest mean squared error.
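
As the Pseudocode row notes, the paper's formal algorithm appears only in the full arXiv version. For orientation, below is a minimal sketch of the standard surprisingly popular rule of Prelec et al. [2017] applied to a single binary comparison; the paper studies recovering full rankings, which this sketch does not attempt, and the function and variable names here are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of surprisingly popular (SP) voting on one binary
# question with alternatives 'A' and 'B'. Each respondent reports an
# answer plus a prediction of the fraction of others who will answer 'A'.
# The alternative whose actual vote share exceeds its mean predicted
# share is "surprisingly popular" and wins. Names are illustrative only.

def surprisingly_popular(votes, predictions):
    """Return the SP winner of a binary question.

    votes       -- list of reported answers, each 'A' or 'B'
    predictions -- list of predicted fractions (in [0, 1]) of respondents
                   answering 'A', one per respondent
    """
    actual_a = votes.count("A") / len(votes)
    predicted_a = sum(predictions) / len(predictions)
    # 'A' wins iff it is more popular than the crowd predicted.
    return "A" if actual_a > predicted_a else "B"


# Example: the majority answers 'B', but 'A' (40% actual vs. ~18%
# predicted) is more popular than expected, so SP overrules the majority.
votes = ["A", "A", "B", "B", "B"]
predictions = [0.1, 0.2, 0.1, 0.3, 0.2]
print(surprisingly_popular(votes, predictions))  # -> 'A'
```

The example exercises the defining property of the rule: it can overrule a simple majority whenever an alternative turns out to be more popular than the crowd predicted it would be.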
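The Dataset Splits and Experiment Setup rows describe a concrete tuning procedure: a grid search over α from 0.55 to 0.95 and β from 0.05 to 0.45, both in increments of 0.025, keeping the pair with the lowest mean squared error on the training questions. A minimal sketch of that loop follows, assuming a caller-supplied evaluate_mse; that function is a hypothetical stand-in for the paper's actual aggregation-and-scoring pipeline, which is not specified in this excerpt.

```python
import itertools

def frange(start, stop, step):
    """Inclusive float range, built on integer counts to avoid FP drift."""
    n = round((stop - start) / step)
    return [start + i * step for i in range(n + 1)]

def grid_search(train_questions, evaluate_mse):
    """Return the (alpha, beta) pair with the lowest MSE on the training set."""
    best_pair, best_mse = None, float("inf")
    for alpha, beta in itertools.product(
        frange(0.55, 0.95, 0.025),  # 17 alpha values
        frange(0.05, 0.45, 0.025),  # 17 beta values
    ):
        mse = evaluate_mse(train_questions, alpha, beta)
        if mse < best_mse:
            best_pair, best_mse = (alpha, beta), mse
    return best_pair

# Demonstration with a dummy objective (not the paper's loss): the search
# recovers the grid point nearest the dummy optimum (0.7, 0.2).
dummy_mse = lambda qs, a, b: (a - 0.7) ** 2 + (b - 0.2) ** 2
print(grid_search(range(15), dummy_mse))  # ~ (0.7, 0.2)
```

Exhaustive search over this 17 x 17 grid costs only 289 evaluations per elicitation format, which is why a simple double loop suffices here rather than any smarter optimizer.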