Winner Determination in Huge Elections with MapReduce
Authors: Theresa Csar, Martin Lackner, Reinhard Pichler, Emanuel Sallinger
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we apply the Map Reduce framework which has been specifically designed for dealing with big data to various versions of the winner determination problem. We obtain efficient and highly parallel algorithms and provide a theoretical analysis and experimental evaluation. |
| Researcher Affiliation | Academia | TU Wien, Austria University of Oxford, UK |
| Pseudocode | Yes | Algorithm 1 Schwartz Set; Algorithm 2 Map Vertex; Algorithm 3 Reduce Vertex; Algorithm 4 Smith Set; |
| Open Source Code | Yes | Our implementation is available as open-source software1, as is the code for generating the datasets and running the code on EMR. 1https://github.com/theresacsar/Big Voting |
| Open Datasets | Yes | The datasets used to generate our results will be provided as open data. |
| Dataset Splits | No | The paper uses synthetic datasets and discusses generating instances but does not specify train, validation, or test splits. It mentions generating "five different instances" for sparse graphs for averaging results, but this is not a formal split. |
| Hardware Specification | Yes | In the results reported here, we utilize m3.xlarge instances, but note that tests with other types of instances did not yield substantially different results with regard to scalability. We use up to 128 of such instances. |
| Software Dependencies | No | The paper states "Amazon EMR provides a managed Hadoop framework, an open-source implementation of Map Reduce." but does not provide specific version numbers for Hadoop or other software components. |
| Experiment Setup | Yes | A timeout of 60 minutes was used for this experiment. We see that the time incurred falls below 20 minutes once 1+16 EC2 instances are used, and below 15 minutes for 1+32 instances. The times for the denser (m2/10 edges) graphs is shown in Figure 6, also between 1000 to 7000 candidates, but with up to 1 + 128 EC2 instances. A timeout of 90 minutes was used for this experiment. |