Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
Authors: Ying Wen, Yaodong Yang, Jun Wang
Venue: IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. ... On the empirical side, we validate our findings on a variety of MARL benchmarks. Precisely, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements compared to state-of-the-art opponent modeling baselines on the normal-form games and the cooperative navigation benchmark. |
| Researcher Affiliation | Collaboration | Ying Wen¹, Yaodong Yang¹,², Jun Wang¹. ¹University College London; ²Huawei Research & Development U.K. {ying.wen, yaodong.yang, jun.wang}@cs.ucl.ac.uk |
| Pseudocode | Yes | Algorithm 1 GR2 Soft Actor-Critic Algorithm |
| Open Source Code | Yes | The experiment code and appendix are available at https://github.com/ying-wen/gr2 |
| Open Datasets | Yes | We start the experiments by elaborating how the GR2 model works on Keynes Beauty Contest, and then move onto the normal-form games that have non-trivial equilibria where common MARL methods fail to converge. Finally, we test on the navigation task that requires effective opponent modeling. ... We test the GR2 methods in more complexed Particle World environments [Lowe et al., 2017]. |
| Dataset Splits | No | The paper mentions evaluating on benchmarks and using self-play, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or absolute counts) in the main text. It defers some details to an appendix: "We leave the detailed hyper-parameter settings and ablation studies in Appendix F due to space limit." |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU models, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | We denote k as the highest level of reasoning in GR2-L/M, and adopt k = {1, 2, 3}, λ = 1.5. All results are reported with 6 random seeds. |
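
The setup row quotes reasoning levels k = {1, 2, 3} and λ = 1.5 without saying what they control. The sketch below is one possible reading, not the paper's GR2 Soft Actor-Critic algorithm: a textbook level-k recursion on a p-beauty contest, plus a variant in which a level-k player best-responds to a mixture over lower levels. Several things here are assumptions not stated in the table: the multiplier `p = 2/3`, the level-0 guess of 50, the best response `p * mean`, and the use of λ as a Poisson rate over levels (as in cognitive-hierarchy models). All function names are hypothetical.

```python
import math


def level_k_guess(k, p=2 / 3, level0_guess=50.0):
    """Guess of a level-k player who assumes all others reason at level k-1.

    Assumption: the best response to an average guess g is p * g.
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess
    return guess


def truncated_poisson_weights(max_level, lam):
    """Poisson(lam) probabilities over levels 0..max_level, renormalized to sum to 1."""
    raw = [math.exp(-lam) * lam**j / math.factorial(j) for j in range(max_level + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def mixture_guess(k, lam=1.5, p=2 / 3, level0_guess=50.0):
    """Guess of a level-k player who believes others are spread over levels 0..k-1.

    Assumption: beliefs over lower levels follow a truncated Poisson(lam).
    """
    if k == 0:
        return level0_guess
    weights = truncated_poisson_weights(k - 1, lam)
    expected_others = sum(
        w * mixture_guess(j, lam, p, level0_guess) for j, w in enumerate(weights)
    )
    return p * expected_others


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(level_k_guess(k), 3), round(mixture_guess(k), 3))
```

Under these assumptions, the pure level-k guess shrinks geometrically with k. The mixture variant shrinks more slowly because it keeps some belief that opponents reason at shallower levels.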