A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
Authors: Oliver Slumbers, David Henry Mguni, Stefano B. Blumberg, Stephen Marcus McAleer, Yaodong Yang, Jun Wang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretically and empirically, we show RAE shares many properties with a Nash Equilibrium (NE), establishing convergence properties and generalising to risk-dominant NE in certain cases. ... We empirically demonstrate the minimum reward variance benefits of RAE in matrix games with high-risk outcomes. Results on MARL experiments show RAE generalises to risk-dominant NE in a trust dilemma game and that it reduces instances of crashing by 7x in an autonomous driving setting versus the best performing baseline. |
| Researcher Affiliation | Collaboration | (1) University College London, London, UK; (2) Huawei Technologies, London, UK; (3) Independent Researcher; (4) Peking University, Beijing, China. |
| Pseudocode | Yes | Appendix F (Pseudo-code) includes 'Algorithm 1 SFP' and 'Algorithm 2 PSRO-RAE'. (A generic fictitious-play sketch follows the table.) |
| Open Source Code | No | The paper does not explicitly state that code for the described methodology is released, nor does it provide a direct link to a code repository for the implementation. |
| Open Datasets | Yes | Our stag-hunt environment is taken from (Peysakhovich & Lerer, 2018)... Our driving environment is based on the two-way environment from (Leurent, 2018). |
| Dataset Splits | No | The paper mentions '50 episodes over 5 seeds for intra-distribution testing' for its experiments, but does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | Yes | All experiments run on one machine with: AMD Ryzen Threadripper 3960X (24 cores), 1x NVIDIA GeForce RTX 3090. |
| Software Dependencies | Yes | 'PPO HYPERPARAMS: DEFAULT SB3 (Raffin et al., 2021)' from Table 1, and the reference 'Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., and Dormann, N. Stable-Baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research, 22(268):1–8, 2021.' |
| Experiment Setup | Yes | Appendix G ('Hyperparameter Settings for our experiments') provides detailed settings, including 'FP ITERATIONS 100', 'TREMBLE PROBABILITY 0.001', 'LEARNING RATE 0.005', and 'RAE GAMMA 0.1, 0.5', among many others in Table 1. (A hedged PPO configuration sketch follows the table.) |
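The paper's Algorithm 1 (SFP) is not reproduced in this report. As rough orientation only, the sketch below implements generic stochastic (smoothed) fictitious play in a stag-hunt matrix game, the game family used in the paper's trust-dilemma experiments, and reuses the 'FP ITERATIONS 100' setting from Table 1. The payoff values, temperature, and logit best response are our own illustrative assumptions, and the sketch omits the paper's risk-averse (RAE) objective entirely.

```python
import numpy as np

# Illustrative stag-hunt payoffs for the row player (symmetric game).
# Actions: 0 = Stag, 1 = Hare. Values are assumptions, not the paper's.
A = np.array([[4.0, 0.0],
              [3.0, 3.0]])

def smooth_best_response(payoffs, opp_mix, temperature=0.1):
    """Logit (smoothed) best response to the opponent's empirical mixture."""
    expected = payoffs @ opp_mix          # expected payoff of each action
    logits = expected / temperature
    logits -= logits.max()                # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
counts = np.ones((2, 2))  # counts[i]: empirical action counts of player i

for _ in range(100):  # 'FP ITERATIONS 100' from the paper's Table 1
    mixes = counts / counts.sum(axis=1, keepdims=True)
    # In a symmetric game both players best-respond through the same matrix.
    p0 = smooth_best_response(A, mixes[1])
    p1 = smooth_best_response(A, mixes[0])
    counts[0, rng.choice(2, p=p0)] += 1
    counts[1, rng.choice(2, p=p1)] += 1

print("empirical strategies:", counts / counts.sum(axis=1, keepdims=True))
```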
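The experiment-setup row can likewise be made concrete with Stable-Baselines3. The sketch below keeps SB3's default PPO hyperparameters and overrides only the learning rate reported in Table 1. The environment ID `two-way-v0` is highway-env's registered two-way driving task (Leurent, 2018); since the paper says its driving environment is merely *based on* that task, this is an approximation on our part. The paper's RAE-specific settings (tremble probability, RAE gamma) belong to its own algorithm and have no counterpart in vanilla PPO, so they are omitted.

```python
import gymnasium as gym
import highway_env  # noqa: F401 -- importing registers the Leurent (2018) driving envs
from stable_baselines3 import PPO

# Two-way driving task from highway-env; the paper's environment is based on
# this one, so treat the sketch as an approximation of its setup.
env = gym.make("two-way-v0")

# Default SB3 PPO hyperparameters (Raffin et al., 2021), with only the
# 'LEARNING RATE 0.005' override from the paper's Table 1.
model = PPO("MlpPolicy", env, learning_rate=0.005, verbose=1)
model.learn(total_timesteps=10_000)  # illustrative budget, not the paper's
```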