Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning

Authors: Ying Wen, Yaodong Yang, Jun Wang

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. ... On the empirical side, we validate our findings on a variety of MARL benchmarks. Precisely, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements compared to state-of-the-art opponent modeling baselines on the normal-form games and the cooperative navigation benchmark.
Researcher Affiliation | Collaboration | Ying Wen¹, Yaodong Yang¹,², Jun Wang¹; ¹University College London, ²Huawei Research & Development U.K.; {ying.wen, yaodong.yang, jun.wang}@cs.ucl.ac.uk
Pseudocode | Yes | Algorithm 1: GR2 Soft Actor-Critic Algorithm
Open Source Code | Yes | The experiment code and appendix are available at https://github.com/ying-wen/gr2
Open Datasets | Yes | We start the experiments by elaborating how the GR2 model works on Keynes Beauty Contest, and then move onto the normal-form games that have non-trivial equilibria where common MARL methods fail to converge. Finally, we test on the navigation task that requires effective opponent modeling. ... We test the GR2 methods in more complexed Particle World environments [Lowe et al., 2017]
Dataset Splits | No | The paper mentions evaluating on benchmarks and using self-play, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or absolute counts) in the main text. It defers some details to an appendix: "We leave the detailed hyper-parameter settings and ablation studies in Appendix F due to space limit."
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU models, or cloud computing instance types used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x).
Experiment Setup | Yes | We denote k as the highest level of reasoning in GR2-L/M, and adopt k = {1, 2, 3}, λ = 1.5. All results are reported with 6 random seeds.
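
To make the reported settings k = {1, 2, 3} and λ = 1.5 more concrete, the following is a minimal sketch of level-k reasoning on a Keynes Beauty Contest, not the authors' GR2 implementation. Several things here are assumptions that the summary above does not state: the contest multiplier p = 2/3, the level-0 guess of 50, the simplification that a player's best response is p times the expected opponent guess, and the reading of λ as the mean of a Poisson distribution over reasoning levels (as in cognitive hierarchy models). All function names are hypothetical.

```python
import math


def level_k_guess(k, p=2 / 3, level0_guess=50.0):
    """Guess of a level-k reasoner who assumes everyone else is level k-1.

    Assumption: the best response to an average guess g is p * g.
    """
    return level0_guess * p ** k


def truncated_poisson_weights(max_level, lam=1.5):
    """Poisson(lam) probabilities for levels 0..max_level, renormalised to sum to 1."""
    raw = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(max_level + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def mixture_guess(k, p=2 / 3, lam=1.5, level0_guess=50.0):
    """Guess of a level-k reasoner facing a Poisson-weighted mix of levels 0..k-1."""
    if k == 0:
        return level0_guess
    weights = truncated_poisson_weights(k - 1, lam)
    expected_opponent = sum(
        w * mixture_guess(j, p, lam, level0_guess) for j, w in enumerate(weights)
    )
    return p * expected_opponent


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, level_k_guess(k), mixture_guess(k))
```

Under these assumptions, deeper reasoning levels push the guess toward 0, and the mixture variant moves more slowly because each level still places weight on shallower opponents.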