Nonparametric Detection of Gerrymandering in Multiparty Plurality Elections

Authors: Dariusz Stolicki, Wojciech Słomczyński, Stanisław Szufa

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We address this problem by testing our method on a large sample of simulated elections, consisting both of fair districting plans, drawn at random with a distribution intended to approximate the uniform distribution on the set of all admissible plans, and of unfair plans generated by an optimization algorithm designed to maximize one party's vote share. [...] On the experimental dataset, our method achieves precision of .896 and recall of .912. We have tested our method on data from four sets of elections: 1. D14, 2014 Polish municipal elections (2412 instances), 2. D18, 2018 Polish municipal elections (2145 instances), 3. DU, U.S. House elections, 1900-2022 (2848 instances), 4. DN, national legislative elections from 15 countries (206 instances) [Kollman et al., 2023]. (See the precision/recall sketch after the table.)
Researcher Affiliation | Academia | Dariusz Stolicki¹, Wojciech Słomczyński¹ and Stanisław Szufa²,³. ¹Jagiellonian University, Center for Quantitative Political Science, Kraków; ²AGH University, Kraków; ³CNRS, LAMSADE, Université Paris Dauphine-PSL.
Pseudocode | No | The paper defines mathematical models and problem formulations (e.g., 'Model 3.1', 'Model 3.2', 'Problem 4.1') and describes algorithmic steps in prose, but it does not include any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing code for its methodology, nor does it include a direct link to a code repository. It cites an arXiv preprint [Słomczyński et al., 2024], which is typically a paper repository, not a code repository.
Open Datasets | Yes | We have tested our method on data from four sets of elections: 1. D14, 2014 Polish municipal elections (2412 instances), 2. D18, 2018 Polish municipal elections (2145 instances), 3. DU, U.S. House elections, 1900-2022 (2848 instances), 4. DN, national legislative elections from 15 countries (206 instances) [Kollman et al., 2023].
Dataset Splits | Yes | We still need to choose two hyperparameters of the model: the scaling vector h0 and the nearest-neighbor parameter k. This we do using leave-one-out cross-validation [Li and Racine, 2004; Härdle et al., 1988] with the objective function defined as the Kullback-Leibler [Kullback and Leibler, 1951] divergence between the predicted and actual value vectors, together with an optimization algorithm by Hurvich et al. [1998] which penalizes high-variance bandwidths (with variance measured as the trace of the parameter matrix) similarly to the Akaike information criterion [Akaike, 1974]. (See the cross-validation sketch after the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU types, or other computing resource specifications.
Software Dependencies | No | The paper mentions various methods and algorithms (e.g., 'support vector machine-based classifier', 'Monte Carlo algorithms', 'integer linear programming algorithm') but does not specify any particular software libraries, frameworks, or solvers along with their version numbers.
Experiment Setup | Yes | We fix the number of districts to be created at 80 in the French and Polish case, and 50 in the German and municipal cases. [...] We still need to choose two hyperparameters of the model: the scaling vector h0 and the nearest-neighbor parameter k. This we do using leave-one-out cross-validation [Li and Racine, 2004; Härdle et al., 1988] with the objective function defined as the Kullback-Leibler [Kullback and Leibler, 1951] divergence between the predicted and actual value vectors, together with an optimization algorithm by Hurvich et al. [1998] which penalizes high-variance bandwidths (with variance measured as the trace of the parameter matrix) similarly to the Akaike information criterion [Akaike, 1974]. [...] For m > 3, the decision boundary is determined on the basis of the data using a support vector machine-based classifier [Boser et al., 1992; Cortes and Vapnik, 1995] with a third-order polynomial kernel, and then approximated by a strictly decreasing B-spline of degree 3, with boundary nodes at 1/m and 1/2 and interior nodes fitted using cross-validation. (See the SVM and B-spline sketch after the table.)
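
The precision and recall figures quoted under Research Type can be reproduced mechanically once true and predicted labels for the simulated plans are available. The sketch below is purely illustrative: the labels are invented, not the paper's data, and it only shows how the two reported metrics are computed.

```python
# Minimal illustration of the reported evaluation metrics.
# Labels are invented: 1 = unfair (optimizer-generated) plan,
# 0 = fair (randomly drawn) plan; y_pred is a hypothetical detector output.
def precision_recall(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)

y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```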
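
The Dataset Splits entry describes choosing the scaling vector h0 and the nearest-neighbor parameter k by leave-one-out cross-validation scored with Kullback-Leibler divergence between predicted and actual value vectors. Below is a minimal sketch of that selection loop; the k-NN estimator, the toy data, and the hyperparameter grid are placeholder assumptions, not the authors' model, and the Hurvich et al. variance penalty is omitted.

```python
# Hedged sketch: leave-one-out CV with a KL-divergence objective for
# selecting (h0, k). Estimator, grid, and data are illustrative only.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for probability vectors, clipped for numerical stability."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def knn_predict(X_train, Y_train, x, h0, k):
    """Predict a vote-share vector for x as a kernel-weighted average of its
    k nearest neighbours, with per-coordinate scaling vector h0."""
    d = np.linalg.norm((X_train - x) / h0, axis=1)
    idx = np.argsort(d)[:k]
    w = np.exp(-0.5 * d[idx] ** 2)
    w /= w.sum()
    pred = w @ Y_train[idx]
    return pred / pred.sum()

def loo_cv_score(X, Y, h0, k):
    """Mean leave-one-out KL divergence for a given (h0, k) setting."""
    n = len(X)
    scores = []
    for i in range(n):
        mask = np.arange(n) != i
        pred = knn_predict(X[mask], Y[mask], X[i], h0, k)
        scores.append(kl_divergence(Y[i], pred))
    return float(np.mean(scores))

# Toy data: 40 units, 2 covariates, 3-party vote-share vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
Y = rng.dirichlet(np.ones(3), size=40)

best = min(
    (loo_cv_score(X, Y, np.full(2, h), k), h, k)
    for h in (0.5, 1.0, 2.0) for k in (5, 10, 20)
)
print("best (score, h, k):", best)
```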
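
The Experiment Setup entry mentions, for m > 3, an SVM classifier with a third-order polynomial kernel whose decision boundary is then approximated by a degree-3 B-spline with boundary nodes at 1/m and 1/2. The sketch below illustrates that pipeline on synthetic data; the feature choice, labels, and smoothing parameter are assumptions, and the strict-monotonicity constraint and cross-validated interior nodes from the paper are not reproduced.

```python
# Hedged sketch: poly-kernel SVM decision boundary, traced and then
# smoothed with a cubic B-spline. All data below are synthetic.
import numpy as np
from sklearn.svm import SVC
from scipy.interpolate import splrep, splev
from scipy.optimize import brentq

m = 4  # number of parties; the paper places boundary nodes at 1/m and 1/2

# Toy training data: first feature = largest party's vote share in [1/m, 1/2],
# second feature = an auxiliary statistic in [0, 1];
# label 1 = flagged as gerrymandered, 0 = fair (entirely synthetic).
rng = np.random.default_rng(1)
X = rng.uniform([1 / m, 0.0], [0.5, 1.0], size=(300, 2))
labels = (X[:, 1] > 1.2 - 1.8 * X[:, 0] + rng.normal(0, 0.05, 300)).astype(int)

# Third-order polynomial kernel, as in the paper; coef0=1 keeps
# lower-order terms in the kernel expansion.
clf = SVC(kernel="poly", degree=3, coef0=1.0).fit(X, labels)

# Trace the decision boundary: for each vote share v, locate the value of
# the second feature where the SVM decision function changes sign.
vs, ys = [], []
for v in np.linspace(1 / m + 0.01, 0.5 - 0.01, 25):
    f = lambda y, v=v: clf.decision_function([[v, y]])[0]
    if f(0.0) * f(1.0) < 0:
        vs.append(v)
        ys.append(brentq(f, 0.0, 1.0))

# Approximate the traced boundary with a degree-3 B-spline (smoothed fit).
if len(vs) >= 4:
    tck = splrep(np.array(vs), np.array(ys), k=3, s=0.01)
    print("boundary estimate at vote share 0.35:", float(splev(0.35, tck)))
```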