Cooperative Multi-Agent Fairness and Equivariant Policies
Authors: Niko A. Grupen, Bart Selman, Daniel D. Lee
AAAI 2022, pp. 9350-9359
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We refer to our multi-agent learning strategy as Fairness through Equivariance (Fair-E) and demonstrate its effectiveness empirically. We then introduce Fairness through Equivariance Regularization (Fair-ER) as a soft-constraint version of Fair-E and show that it reaches higher levels of utility than Fair-E and fairer outcomes than non-equivariant policies. |
| Researcher Affiliation | Academia | Niko A. Grupen1, Bart Selman1, Daniel D. Lee2 1 Cornell University, Ithaca, NY 2 Cornell Tech, New York, NY {nag83, bs54m, ddl46}@cornell.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper describes the pursuit-evasion game setup but does not provide concrete access information (specific link, DOI, repository name, or formal citation with authors/year) for a publicly available or open dataset corresponding to their specific experimental environment instance. |
| Dataset Splits | No | The paper does not provide specific dataset split information for validation (e.g., percentages, sample counts, or citations to predefined validation splits). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using DDPG but does not specify ancillary software details such as library or solver names with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In each experiment, n=3 pursuer agents are trained in a decentralized manner (each following DDPG) for a total of 125,000 episodes, during which the pursuer velocity is decreased from \|v_p\| = 1.2 to \|v_p\| = 0.4. The evader speed is fixed at \|v_e\| = 1.0. ... J(φ_i) + λJ_eqv(φ_1, ..., φ_i, ..., φ_n) (Eq. 10), where λ is a fairness control parameter weighting the strength of equivariance. |
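The experiment-setup row above quotes the Fair-ER objective from Eq. (10), J(φ_i) + λJ_eqv(φ_1, ..., φ_n). Since the paper does not release code, the sketch below only illustrates how such a regularized actor loss could be assembled in PyTorch; the concrete form of the equivariance penalty and all names (`fair_er_loss`, `actor_loss_fn`, `lam`) are assumptions for illustration, not the authors' implementation.

```python
import torch

# Hypothetical sketch of the Fair-ER objective in Eq. (10):
#     J(phi_i) + lambda * J_eqv(phi_1, ..., phi_i, ..., phi_n)
# The specific penalty below (agents placed in the same role should act
# alike) is an illustrative assumption, not the paper's exact definition.

def fair_er_loss(actors, observations, actor_loss_fn, lam=0.1):
    """Combine each agent's standard policy loss with an equivariance penalty.

    actors:        list of n policy networks (one per pursuer)
    observations:  tensor of shape (n, obs_dim), one observation per agent
    actor_loss_fn: callable returning the usual DDPG actor loss J(phi_i)
    lam:           fairness control parameter (lambda in Eq. 10)
    """
    losses = []
    for i, actor in enumerate(actors):
        # Standard utility term J(phi_i), e.g. -Q(s, pi_i(s)) in DDPG.
        j_i = actor_loss_fn(actor, observations[i])

        # Illustrative equivariance penalty: penalize agent i for acting
        # differently than agent j would when given agent j's observation.
        j_eqv = torch.tensor(0.0)
        for j, other in enumerate(actors):
            if j != i:
                diff = actor(observations[j]) - other(observations[j]).detach()
                j_eqv = j_eqv + (diff ** 2).mean()

        losses.append(j_i + lam * j_eqv)
    return losses
```

Setting `lam=0` recovers the unregularized (non-equivariant) DDPG update, while larger values push the team toward the hard-constrained Fair-E behavior described in the paper.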