A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

Authors: Oliver Slumbers, David Henry Mguni, Stefano B. Blumberg, Stephen Marcus McAleer, Yaodong Yang, Jun Wang

ICML 2023

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
Evidence: "Theoretically and empirically, we show RAE shares many properties with a Nash Equilibrium (NE), establishing convergence properties and generalising to risk-dominant NE in certain cases. ... We empirically demonstrate the minimum reward variance benefits of RAE in matrix games with high-risk outcomes. Results on MARL experiments show RAE generalises to risk-dominant NE in a trust dilemma game and that it reduces instances of crashing by 7x in an autonomous driving setting versus the best performing baseline."

Researcher Affiliation: Collaboration
Evidence: "(1) University College London, London, UK; (2) Huawei Technologies, London, UK; (3) Independent Researcher; (4) Peking University, Beijing, China."

Pseudocode: Yes
Evidence: Appendix F provides pseudocode, including "Algorithm 1 SFP" and "Algorithm 2 PSRO-RAE". (A generic sketch of smoothed fictitious play follows below.)

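For orientation only, here is a minimal, hypothetical sketch of smoothed fictitious play (SFP) on a two-player matrix game. It is not the authors' Appendix F pseudocode; it illustrates the standard SFP scheme that Algorithm 1 presumably builds on, where each player plays a softmax ("smoothed") best response to the opponent's empirical average strategy. The payoff matrix, temperature, and iteration count are illustrative.

```python
# Hypothetical SFP sketch, NOT the paper's Algorithm 1: softmax best
# responses against the opponent's empirical average strategy.
import numpy as np

def smoothed_fictitious_play(A, B, iterations=100, temperature=0.1, seed=0):
    """Run SFP on a bimatrix game (A: row player's payoffs, B: column player's)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    counts_row, counts_col = np.ones(n), np.ones(m)  # empirical action counts
    for _ in range(iterations):
        avg_row = counts_row / counts_row.sum()
        avg_col = counts_col / counts_col.sum()
        u_row = A @ avg_col   # expected payoff of each row action
        u_col = avg_row @ B   # expected payoff of each column action
        # Softmax ("smoothed") best responses to the opponent's average play.
        p_row = np.exp((u_row - u_row.max()) / temperature)
        p_col = np.exp((u_col - u_col.max()) / temperature)
        counts_row[rng.choice(n, p=p_row / p_row.sum())] += 1
        counts_col[rng.choice(m, p=p_col / p_col.sum())] += 1
    return counts_row / counts_row.sum(), counts_col / counts_col.sum()

# Stag hunt: (Stag, Stag) is payoff-dominant, (Hare, Hare) is risk-dominant.
A = np.array([[4.0, 0.0], [3.0, 3.0]])
print(smoothed_fictitious_play(A, A.T))
```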
Open Source Code: No
Evidence: The paper makes no explicit statement about releasing the code for the described methodology and provides no link to a code repository for the implementation.

Open Datasets: Yes
Evidence: "Our stag-hunt environment is taken from (Peysakhovich & Lerer, 2018)... Our driving environment is based on the two-way environment from (Leurent, 2018)."

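Since the driving environment builds on Leurent (2018), i.e. the highway-env package, the following hedged sketch shows how one might load its base two-way scenario. The env id "two-way-v0" and the gymnasium-style API reflect current highway-env releases; the authors' modified environment itself is not public, so this only reproduces the base scenario.

```python
# Hedged sketch: loading the base two-way driving scenario from highway-env
# (Leurent, 2018). Exact env id and API depend on the installed versions.
import gymnasium as gym
import highway_env  # noqa: F401  (importing registers the highway-env scenarios)

env = gym.make("two-way-v0")
obs, info = env.reset(seed=0)
done = truncated = False
while not (done or truncated):
    action = env.action_space.sample()  # placeholder for a trained policy
    obs, reward, done, truncated, info = env.step(action)
env.close()
```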
Dataset Splits: No
Evidence: The paper mentions "50 episodes over 5 seeds for intra-distribution testing" for its experiments, but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility.

Hardware Specification: Yes
Evidence: "All experiments run on one machine with: AMD Ryzen Threadripper 3960X 24-Core; 1x NVIDIA GeForce RTX 3090."

Software Dependencies: Yes
Evidence: "PPO HYPERPARAMS DEFAULT SB3 (RAFFIN ET AL., 2021)" from Table 1, and the reference: Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., and Dormann, N. Stable-Baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research, 22(268):1-8, 2021.

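Given the note that PPO hyperparameters were left at Stable-Baselines3 defaults, a minimal sketch of such a setup is below. "CartPole-v1" is a stand-in environment, since the paper's own environments are not released.

```python
# Minimal sketch of PPO at Stable-Baselines3 defaults, matching the paper's
# "PPO HYPERPARAMS DEFAULT SB3 (Raffin et al., 2021)" note. The environment
# id is a placeholder, not the paper's setup.
from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=1)  # all hyperparameters at SB3 defaults
model.learn(total_timesteps=10_000)
```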
Experiment Setup: Yes
Evidence: Appendix G ("Hyperparameter Settings for our experiments") provides detailed settings, including "FP ITERATIONS 100", "TREMBLE PROBABILITY 0.001", "LEARNING RATE 0.005", and "RAE GAMMA 0.1, 0.5", among many others in Table 1.

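To make the quoted Appendix G settings concrete, here is a hedged transcription into a Python config dict. Only the four quoted values are grounded in this report; the key names are invented for illustration, and the paper's Table 1 lists many additional settings not reproduced here.

```python
# Hedged transcription of the quoted Appendix G / Table 1 settings.
# Key names are illustrative, not the authors'.
CONFIG = {
    "fp_iterations": 100,          # "FP ITERATIONS 100"
    "tremble_probability": 0.001,  # "TREMBLE PROBABILITY 0.001"
    "learning_rate": 0.005,        # "LEARNING RATE 0.005"
    "rae_gamma": [0.1, 0.5],       # "RAE GAMMA 0.1, 0.5" (two values evaluated)
}
```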