The Surprising Effectiveness of SP Voting with Partial Preferences

Authors: Hadi Hosseini, Debmalya Mandal, Amrit Puhan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through a large-scale crowdsourcing experiment on MTurk, we show that both of our approaches outperform conventional preference aggregation algorithms for the recovery of ground truth rankings, when measured in terms of Kendall-Tau distance and Spearman's ρ. We conduct a human-subject study with 432 participants recruited from Amazon's Mechanical Turk (MTurk) to empirically evaluate the performance of our SP algorithms using metrics such as the Kendall-Tau distance from the full ground truth ranking and Spearman's rank correlation coefficient. We further analyze the collected data and demonstrate that voters' behavior in the experiment, including the minority of the experts, and the SP phenomenon, can be correctly simulated by a concentric mixtures of Mallows model.
Researcher Affiliation | Academia | Hadi Hosseini, College of Information Sciences and Technology, Penn State University, USA, hadi@psu.edu; Debmalya Mandal, Department of Computer Science, University of Warwick, UK, Debmalya.Mandal@warwick.ac.uk; Amrit Puhan, College of Information Sciences and Technology, Penn State University, USA, avp6267@psu.edu
Pseudocode | Yes | Explanation and pseudocode for Partial-SP and Aggregated-SP are provided in Appendix D.2 and Appendix D.3, respectively. ALGORITHM 1: Extract-Reports, ALGORITHM 2: Partial-SP, ALGORITHM 3: Aggregated-SP Aggregation.
Open Source Code | Yes | The dataset can be found here: https://github.com/amrit19/Surprisingly-Popular-Voting-Partial. The associated NeurIPS checklist also indicates that code is provided for reproducibility.
Open Datasets | Yes | The survey encompassed three distinct domains: (i) The geography dataset contains 36 countries with their population estimates, according to the United Nations, (ii) The movies dataset contains 36 movies with their lifetime box-office gross earnings, and (iii) The paintings dataset contains 36 paintings with their latest auction prices. The dataset can be found here: https://github.com/amrit19/Surprisingly-Popular-Voting-Partial
Dataset Splits | No | The paper describes a human-subject crowdsourcing experiment and then evaluates aggregation algorithms on the collected data. It does not mention explicit training, validation, or test splits of a dataset in the context of machine learning model training or hyperparameter tuning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, cloud instances) used to run the experiments or simulations.
Software Dependencies | No | The paper mentions the use of "Stan [8]" for Bayesian inference but does not specify its version or the versions of any other software libraries or programming languages used for implementation.
Experiment Setup | Yes | Each participant was presented with a subset of 5 alternatives, selected based on an inter-alternative gap of 6 positions within the ground-truth ranking. We tested subset sizes of 4 to 6 and inter-alternative gaps of 3 to 8... For each combination of 12 subsets, 9 elicitation formats, and 3 domains, each question received 16 responses. ... In our experiments we use = 0.55 and = 0.1.
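The two evaluation metrics named under Research Type can be illustrated with a minimal Python sketch. It assumes both rankings are complete, tie-free orderings of the same items; the function names and the five-item example are hypothetical and are not taken from the paper's code, which may treat partial rankings differently.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must contain the same items")
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def spearman_rho(ranking_a, ranking_b):
    """Spearman's rank correlation for two tie-free rankings of the same items."""
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items")
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if set(ranking_a) != pos_b.keys():
        raise ValueError("rankings must contain the same items")
    # Sum of squared rank differences, then the closed form for tie-free ranks.
    d_squared = sum((i - pos_b[item]) ** 2 for i, item in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical example: a ground truth and a recovered ranking with two adjacent swaps.
ground_truth = ["A", "B", "C", "D", "E"]
recovered = ["B", "A", "C", "E", "D"]
distance = kendall_tau_distance(ground_truth, recovered)
rho = spearman_rho(ground_truth, recovered)
```

A lower Kendall-Tau distance and a higher Spearman's ρ both indicate that the aggregated ranking is closer to the ground truth.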
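The figures under Experiment Setup can be checked with a short Python sketch. The subset construction below is an assumption: it places 5 alternatives at evenly spaced ranks, 6 positions apart, within the 36-item ground-truth ranking. This happens to produce 12 subsets, which agrees with the quoted count, but the source does not confirm that the subsets were built this way. The function name `gap_subsets` is hypothetical.

```python
def gap_subsets(num_alternatives, subset_size, gap):
    """Return every subset of 1-indexed ranks whose consecutive members are `gap` apart."""
    span = (subset_size - 1) * gap  # distance between the first and last rank
    return [
        [start + k * gap for k in range(subset_size)]
        for start in range(1, num_alternatives - span + 1)
    ]


# Assumed construction: 36 alternatives, subsets of 5, inter-alternative gap of 6.
subsets = gap_subsets(num_alternatives=36, subset_size=5, gap=6)

# Counts quoted in the setup: 9 elicitation formats, 3 domains, 16 responses per question.
num_questions = len(subsets) * 9 * 3
total_responses = num_questions * 16

# Average load if the 432 participants shared the responses evenly (an assumption).
responses_per_participant = total_responses / 432
```

Under these assumptions the design has 324 questions and 5,184 responses, which works out to 12 responses per participant on average; the source does not state the per-participant count.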