Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

Authors: Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Xiao Huang, Hau Chan, Bo An

IJCAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various two-player zero-sum games demonstrate the superiority of SPSRO over different baselines. |
| Researcher Affiliation | Collaboration | 1. Nanyang Technological University, Singapore; 2. Skywork AI, Singapore; 3. The Hong Kong Polytechnic University, Hong Kong SAR, China; 4. University of Nebraska-Lincoln, Lincoln, Nebraska, United States |
| Pseudocode | Yes | Algorithm 1: SPSRO |
| Open Source Code | No | The paper does not provide a direct link or an explicit statement that the source code for the main methodology (SPSRO/Transformer) is openly available. It only mentions |
| Open Datasets | Yes | "We consider the following games. (1) Normal-form games (NFGs) of size \|A1\| × \|A2\|. The payoff matrices are randomly sampled from the range [−1, 1]. The set of sizes is {150×150, 200×200, 250×250, 100×200, 150×300}. (2) Extensive-form games (EFGs): Leduc, Goofspiel, Liar's Dice, Negotiation, and Tic-Tac-Toe, which are implemented in OpenSpiel [Lanctot et al., 2019]." |
| Dataset Splits | No | The paper mentions |
| Hardware Specification | Yes | "All experiments are performed on a machine with a 24-core 3.2GHz Intel i9-12900K CPU and an NVIDIA RTX 3060 GPU, and the results are averaged over 30 independent runs." |
| Software Dependencies | No | The paper mentions software tools like |
| Experiment Setup | Yes | "We generate the training datasets for NFGs and EFGs separately. For NFGs, we generate the dataset on the game of size \|A1\| × \|A2\| = 200×200. For EFGs, we generate the dataset on Leduc Poker. During testing, in addition to the games used to generate the dataset, we directly apply the trained Transformer model to the other games to verify the zero-shot generalization ability of the model." |
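The NFG benchmark described above is simple to reproduce independently of the paper's (unreleased) code: each game is a random zero-sum payoff matrix with entries drawn uniformly from [−1, 1], at the five listed sizes. The sketch below is our own minimal reconstruction of that generation procedure; the function and variable names are ours, not the authors'.

```python
import numpy as np

# Sizes listed in the paper's "Open Datasets" description.
SIZES = [(150, 150), (200, 200), (250, 250), (100, 200), (150, 300)]

def sample_zero_sum_nfg(n_rows, n_cols, rng):
    """Row player's payoff matrix with entries uniform in [-1, 1].

    In a zero-sum game the column player's payoff is the negation of this
    matrix, so one matrix fully specifies the game.
    """
    return rng.uniform(-1.0, 1.0, size=(n_rows, n_cols))

rng = np.random.default_rng(0)  # fixed seed for reproducibility
games = {size: sample_zero_sum_nfg(*size, rng) for size in SIZES}

# Sanity-check shapes and the sampling range.
for (m, n), payoff in games.items():
    assert payoff.shape == (m, n)
    assert payoff.min() >= -1.0 and payoff.max() <= 1.0
```

Per the "Experiment Setup" row, only the 200×200 size would be used to generate training data; the remaining sizes serve as held-out games for testing zero-shot generalization.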