Neutralizing Self-Selection Bias in Sampling for Sortition

Authors: Bailey Flanigan, Paul Gölz, Anupam Gupta, Ariel D. Procaccia

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 Experiments) | We validate our q_i estimation and sampling algorithm on pool data from Climate Assembly UK... For a synthetic population produced by extrapolating the real data, we show that our algorithm obtains fair end-to-end probabilities. As displayed in Fig. 1, the marginals produced by Phase I of our algorithm give each feature-value pair f, v an expected number of seats... To demonstrate that our algorithm's theoretical guarantees lead to realized improvements in individual fairness over the state-of-the-art, we re-run the experiment above, this time using the Sortition Foundation's greedy algorithm to select a panel from each generated pool.
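The fairness check quoted above compares, for each feature-value pair (f, v), the expected number of panel seats implied by the per-person marginal selection probabilities against the desired quota. A minimal sketch of that bookkeeping (the pool data, feature names, and marginal values below are invented for illustration; this is not the authors' code):

```python
# Hedged sketch: given per-person marginal selection probabilities
# (as produced by a Phase-I-style computation), the expected number of
# seats for a feature-value pair (f, v) is the sum of the marginals of
# all pool members carrying that value. All data here is illustrative.

pool = [
    {"gender": "female", "age": "16-29", "marginal": 0.12},
    {"gender": "female", "age": "30-44", "marginal": 0.08},
    {"gender": "male",   "age": "16-29", "marginal": 0.05},
    {"gender": "male",   "age": "30-44", "marginal": 0.10},
]

def expected_seats(pool, feature, value):
    """Expected panel seats for (feature, value) under the marginals."""
    return sum(p["marginal"] for p in pool if p[feature] == value)

print(expected_seats(pool, "gender", "female"))  # ~0.2 expected seats
```

Comparing this quantity to the quota for each (f, v) pair is what a plot like the paper's Fig. 1 visualizes.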
Researcher Affiliation | Academia | Bailey Flanigan, Computer Science Department, Carnegie Mellon University; Paul Gölz, Computer Science Department, Carnegie Mellon University; Anupam Gupta, Computer Science Department, Carnegie Mellon University; Ariel D. Procaccia, School of Engineering and Applied Sciences, Harvard University
Pseudocode | No | The paper describes its algorithm in prose, divided into phases, but does not include structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our implementation is based on PyTorch and Gurobi, runs on consumer hardware, and its code is available on GitHub.
Open Datasets | Yes | For the background sample, we used the 2016 European Social Survey [19]... [19] NSD Norwegian Centre for Research Data. European Social Survey Round 8 Data, 2016. Data file edition 2.1.
Dataset Splits | No | The paper does not specify traditional training, validation, and test splits: its datasets are used to learn participation probabilities and to simulate panel selection, not to train a standard machine-learning model.
Hardware Specification | No | The paper states that its implementation "runs on consumer hardware" but does not provide specific details such as exact GPU/CPU models or memory amounts used for the experiments.
Software Dependencies | No | The paper mentions that its implementation is based on "PyTorch and Gurobi" but does not specify their version numbers or any other software dependencies with versions.
Experiment Setup | No | The paper provides details about the problem instance (e.g., panel size k = 110, recipients r = 60,000) and the estimated participation probabilities, but it does not specify concrete hyperparameters or system-level training settings for its algorithms (e.g., learning rates, batch sizes, or epochs for the MLE process).