reproducibilityindex.ai

Winner Determination in Huge Elections with MapReduce

Authors: Theresa Csar, Martin Lackner, Reinhard Pichler, Emanuel Sallinger

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work, we apply the Map Reduce framework which has been speciﬁcally designed for dealing with big data to various versions of the winner determination problem. We obtain efﬁcient and highly parallel algorithms and provide a theoretical analysis and experimental evaluation.
Researcher Affiliation	Academia	TU Wien, Austria University of Oxford, UK
Pseudocode	Yes	Algorithm 1 Schwartz Set; Algorithm 2 Map Vertex; Algorithm 3 Reduce Vertex; Algorithm 4 Smith Set;
Open Source Code	Yes	Our implementation is available as open-source software1, as is the code for generating the datasets and running the code on EMR. 1https://github.com/theresacsar/Big Voting
Open Datasets	Yes	The datasets used to generate our results will be provided as open data.
Dataset Splits	No	The paper uses synthetic datasets and discusses generating instances but does not specify train, validation, or test splits. It mentions generating "five different instances" for sparse graphs for averaging results, but this is not a formal split.
Hardware Specification	Yes	In the results reported here, we utilize m3.xlarge instances, but note that tests with other types of instances did not yield substantially different results with regard to scalability. We use up to 128 of such instances.
Software Dependencies	No	The paper states "Amazon EMR provides a managed Hadoop framework, an open-source implementation of Map Reduce." but does not provide specific version numbers for Hadoop or other software components.
Experiment Setup	Yes	A timeout of 60 minutes was used for this experiment. We see that the time incurred falls below 20 minutes once 1+16 EC2 instances are used, and below 15 minutes for 1+32 instances. The times for the denser (m2/10 edges) graphs is shown in Figure 6, also between 1000 to 7000 candidates, but with up to 1 + 128 EC2 instances. A timeout of 90 minutes was used for this experiment.