Policy Aggregation

Authors: Parand A. Alamdari, Soroush Ebadian, Ariel D. Procaccia

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, our experiments in Section 7 evaluate the policies returned by different rules based on their fairness; the results identify quantile fairness as especially appealing. The experiments also illustrate the advantage of our approach over rules that optimize measures of social welfare (which are sensitive to affine transformations of the rewards).
Researcher Affiliation | Academia | Parand A. Alamdari, University of Toronto & Vector Institute (parand@cs.toronto.edu); Soroush Ebadian, University of Toronto (soroush@cs.toronto.edu); Ariel D. Procaccia, Harvard University (arielpro@seas.harvard.edu)
Pseudocode | Yes | ALGORITHM 1: Seq. ϵ-Prop. Veto Core [7]; ALGORITHM 2: ϵ-Max Quantile Fairness Procedure; ALGORITHM 3: α-Approvals MILP; ALGORITHM 4: ϵ-Borda count MILP
Open Source Code | Yes | The code for the experiments is available at https://github.com/praal/policy-aggregation.
Open Datasets | Yes | We adapt the dynamic attention allocation environment introduced by D'Amour et al. [11].
Dataset Splits | No | The paper does not specify training/validation/test splits (no percentages or counts). It describes an environment and samples policies for evaluation, rather than partitioning data for supervised model training.
Hardware Specification | Yes | Experiments are all done on an AMD EPYC 7502 32-Core Processor with 258 GiB of system memory. We use Gurobi [18] to solve LPs and MILPs.
Software Dependencies | Yes | We use Gurobi [18] to solve LPs and MILPs (see the LP sketch after the table).
Experiment Setup | Yes | We sample 5 × 10^5 random policies, based on which we fit a generalized logistic function to estimate the CDF of the expected return distribution F_i (Definition 4) for every agent (see the fitting sketch after the table). The policies for α-approval voting rules are optimized with respect to maximum utilitarian welfare. The egalitarian rule finds a policy that maximizes the expected return of the worst-off agent, then optimizes for the second worst-off agent, and so on. The implementation details of Borda count are in Appendix D.
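
The CDF-estimation step described in the Experiment Setup row can be illustrated with a short, hedged sketch. The snippet below is not the authors' implementation: the sampled returns are a random placeholder rather than returns of sampled policies, and the four-parameter `generalized_logistic` form, its initial guesses, and the name `F_i` are illustrative assumptions. It only shows how a generalized logistic curve could be fit to the empirical CDF of sampled expected returns with `scipy.optimize.curve_fit`.

```python
# Hedged sketch: estimate an agent's CDF F_i of expected returns by fitting
# a generalized logistic curve to the empirical CDF of sampled policy returns.
# The parameterization and the placeholder sample data are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def generalized_logistic(x, a, k, b, x0):
    """Four-parameter logistic: lower asymptote a, upper asymptote k,
    growth rate b, inflection location x0."""
    return a + (k - a) / (1.0 + np.exp(-b * (x - x0)))

rng = np.random.default_rng(0)
# Placeholder for the expected returns of 5e5 sampled policies for one agent.
returns = rng.normal(loc=10.0, scale=2.0, size=500_000)

# Empirical CDF points.
xs = np.sort(returns)
ys = np.arange(1, len(xs) + 1) / len(xs)

# Subsample the empirical CDF for a cheaper fit.
idx = np.linspace(0, len(xs) - 1, 2_000).astype(int)
params, _ = curve_fit(
    generalized_logistic, xs[idx], ys[idx],
    p0=[0.0, 1.0, 1.0, float(np.median(xs))], maxfev=10_000,
)

def F_i(x):
    """Estimated CDF of agent i's expected return (as used for quantile fairness)."""
    return np.clip(generalized_logistic(x, *params), 0.0, 1.0)

print(F_i(np.array([8.0, 10.0, 12.0])))
```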
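
The Gurobi dependency and the first stage of the egalitarian rule can likewise be sketched. The LP below, over discounted state-action occupancy measures, maximizes the worst-off agent's expected return with gurobipy; the tiny random MDP, the 1/(1 - γ) normalization, and all variable names are assumptions for illustration rather than the paper's formulation, and the later leximin stages (fixing the worst-off value and re-optimizing for the next agent) are omitted.

```python
# Hedged sketch: max-min (egalitarian, stage 1) policy aggregation as an LP over
# discounted state-action occupancy measures, solved with gurobipy.
# The tabular MDP and rewards below are made-up placeholders.
import numpy as np
import gurobipy as gp
from gurobipy import GRB

n_states, n_actions, n_agents, gamma = 3, 2, 2, 0.95
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(0.0, 1.0, size=(n_agents, n_states, n_actions))   # R[i, s, a]
mu0 = np.full(n_states, 1.0 / n_states)                           # initial state distribution

m = gp.Model("egalitarian_stage1")
d = m.addVars(n_states, n_actions, lb=0.0, name="d")   # occupancy measures d(s, a)
t = m.addVar(lb=-GRB.INFINITY, name="t")               # worst-off expected return

# Bellman-flow constraints defining a valid discounted occupancy measure.
for s in range(n_states):
    m.addConstr(
        gp.quicksum(d[s, a] for a in range(n_actions))
        == (1 - gamma) * mu0[s]
        + gamma * gp.quicksum(P[s2, a, s] * d[s2, a]
                              for s2 in range(n_states) for a in range(n_actions))
    )

# t lower-bounds every agent's expected return (scaled by 1 - gamma).
for i in range(n_agents):
    m.addConstr(
        (1 - gamma) * t
        <= gp.quicksum(R[i, s, a] * d[s, a]
                       for s in range(n_states) for a in range(n_actions))
    )

m.setObjective(t, GRB.MAXIMIZE)
m.optimize()

# Recover a stochastic policy pi(a | s) by normalizing the occupancy measure row-wise.
occ = np.array([[d[s, a].X for a in range(n_actions)] for s in range(n_states)])
pi = occ / occ.sum(axis=1, keepdims=True)
print("worst-off expected return:", t.X)
print("policy:", pi)
```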