Computing the Schulze Method for Large-Scale Preference Data Sets

Authors: Theresa Csar, Martin Lackner, Reinhard Pichler

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our optimisations in an experimental evaluation. We use daily music charts provided by the Spotify application to generate data sets with up to 18,400 alternatives; the corresponding weighted tournament graphs have up to 160 million weighted edges. We show that such data sets can be computed in the matter of minutes and demonstrate that runtimes can be significantly reduced by an increase in parallelization. Thus, our algorithm enables the application of the Schulze method in data-intensive settings.
Researcher Affiliation | Academia | Theresa Csar, Martin Lackner, Reinhard Pichler; TU Wien, Austria; {csar, lackner, pichler}@dbai.tuwien.ac.at
Pseudocode | Yes | Algorithm 1: Schulze Winner Determination; Algorithm 2: Preprocessing; Algorithm 3: Forward-Backward-Propagation; Algorithm 4: Postprocessing for vertex c
Open Source Code | Yes | The source code of our implementation is part of the open-source project CloudVoting (footnote 7: https://github.com/theresacsar/CloudVoting).
Open Datasets | Yes | To this end, we use the Spotify ranking data of 2017 (footnote 5: https://spotifycharts.com/regional), which consists of daily top-200 music rankings for 53 countries.
Dataset Splits | No | The paper describes four datasets (Global150, Global200, Europe150, Europe200) that were generated from the Spotify data and used for evaluation. It does not specify train/validation/test splits, because the algorithm processes each dataset in full rather than training a predictive model.
Hardware Specification | Yes | We ran our experiments on a Hadoop cluster with 18 nodes (each with an Intel Gold 5118 CPU, 12 cores, 2.3 GHz processor, 256 GB RAM, and a 10 Gb/s network connection).
Software Dependencies | No | Our Schulze algorithm is implemented in the Scala programming language. Furthermore, we use the GraphX library (footnote 6), which is built on top of Spark [Zaharia et al., 2010], an open-source cluster-computing engine. The paper mentions Scala, GraphX, and Spark but does not provide version numbers for these components.
Experiment Setup | No | The paper describes the infrastructure used for the experiments (e.g., number of nodes and cores) and general aspects of the algorithm's optimisation and processing. It does not report configurable setup details such as hyperparameters (e.g., learning rates, batch sizes, convergence thresholds) of the kind typically tuned for model training.
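
For readers unfamiliar with the method named in the Pseudocode row, the following is a minimal Python sketch of the classic sequential Schulze computation: count pairwise preferences, then run a Floyd-Warshall-style widest-path pass over the weighted tournament graph. It is not the paper's parallel Algorithms 1 to 4 and not the Scala/GraphX implementation. The function names, the use of majority margins as edge weights, and the requirement that every ranking lists every alternative are assumptions made for illustration only.

```python
from itertools import permutations


def pairwise_counts(rankings, alternatives):
    """Return d where d[a][b] is the number of rankings placing a above b.

    Assumption: every ranking is a complete ordering of `alternatives`.
    """
    d = {a: {b: 0 for b in alternatives if b != a} for a in alternatives}
    for ranking in rankings:
        position = {alt: i for i, alt in enumerate(ranking)}
        for a, b in permutations(alternatives, 2):
            if position[a] < position[b]:
                d[a][b] += 1
    return d


def strongest_paths(d, alternatives):
    """Widest-path closure: p[a][b] is the strength of the best path a -> b."""
    p = {a: {} for a in alternatives}
    for a, b in permutations(alternatives, 2):
        # Assumption: an edge a -> b exists only if a beats b by majority,
        # and its weight is the margin of that majority.
        margin = d[a][b] - d[b][a]
        p[a][b] = margin if margin > 0 else 0
    for k in alternatives:
        for a in alternatives:
            if a == k:
                continue
            for b in alternatives:
                if b == a or b == k:
                    continue
                # A path through k is as strong as its weakest link.
                p[a][b] = max(p[a][b], min(p[a][k], p[k][b]))
    return p


def schulze_winners(rankings, alternatives):
    """Alternatives whose strongest path to every rival is at least as strong
    as the rival's strongest path back."""
    d = pairwise_counts(rankings, alternatives)
    p = strongest_paths(d, alternatives)
    return [
        a for a in alternatives
        if all(p[a][b] >= p[b][a] for b in alternatives if b != a)
    ]


if __name__ == "__main__":
    votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
    print(schulze_winners(votes, ["a", "b", "c"]))
```

The triple loop takes time cubic in the number of alternatives, which gives a sense of why a sequential pass like this one becomes costly at the scale of 18,400 alternatives reported above.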