Temporal Induced Self-Play for Stochastic Bayesian Games
Authors: Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test TISP-based algorithms in various games, including finitely repeated security games and a grid-world game. The results show that TISP-PG is more scalable than existing mathematical programming-based methods and significantly outperforms other learning-based methods. |
| Researcher Affiliation | Academia | ¹Shanghai Jiao Tong University, ²Shanghai Qi Zhi Institute, ³Tsinghua University, ⁴Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1: Temporal-Induced Self-Play; Algorithm 2: Compute Test-Time Strategy |
| Open Source Code | No | The paper does not provide any links to source code or explicitly state that the code for the methodology is being released or is available. |
| Open Datasets | No | The paper evaluates its algorithms in custom or adapted game environments (Finitely Repeated Security Game, Exposing Game, Tagging Game) rather than on standard, publicly available datasets. No dataset links or formal citations (author, year) for a publicly available dataset are provided. |
| Dataset Splits | No | The paper does not specify explicit training, validation, and test dataset splits with percentages, sample counts, or references to predefined splits, as it focuses on training agents within game environments rather than on static datasets. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments (e.g., GPU/CPU models, memory); it mentions only training times and sample counts. |
| Software Dependencies | No | The paper mentions using deep reinforcement learning but does not name any software with version numbers (e.g., Python 3.x, PyTorch x.x, CUDA x.x, or specific libraries or solvers). |
| Experiment Setup | No | The paper states "Full experiment details can be found in Appx. D.", but this appendix is not included in the main text, and the main body does not report specific hyperparameters (e.g., learning rate, batch size, number of epochs) or other detailed training configurations. |