
Safe Opponent-Exploitation Subgame Refinement

Authors: Mingyang Liu, Chengjie Wu, Qihan Liu, Yansen Jing, Jun Yang, Pingzhong Tang, Chongjie Zhang

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results show that SES significantly outperforms NE baselines and previous algorithms while keeping exploitability low at the same time.
Researcher Affiliation	Academia	(1) Institute for Interdisciplinary Information Sciences, Tsinghua University; (2) Department of Automation, Tsinghua University
Pseudocode	Yes	The pseudocode of SES is shown in Appendix A.
Open Source Code	No	The code and licence of the code would be released upon the paper acceptance.
Open Datasets	Yes	Our experiment is done in Leduc Hold'em [Southey et al., 2005] and Flop Hold'em Poker (FHP) [Brown et al., 2019].
Dataset Splits	No	No explicit train/validation/test dataset splits are provided. The paper describes using Leduc Hold'em and Flop Hold'em Poker as experimental environments and how different types of opponents are generated.
Hardware Specification	Yes	We test it on Intel(R) Xeon(R) Platinum 8276L CPU @ 2.20GHz
Software Dependencies	No	No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup	Yes	In our experiments, we set the maximum number of CFR iterations to 10 million. The batch size for player 1's strategy estimation is 50. The parameters in the learning rate schedule for the CFR algorithm are set to decay from 0.01 to 0.0001 over 10 million iterations... The estimation error is generated by adding Gaussian noise with zero mean and standard deviation of 0.1, 0.3, 0.6, 0.9, 1.2... We average results over 3 random seeds for opponent generation and 3 random seeds for blueprint generation.
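
The values quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration, not the authors' code: the names `SetupConfig` and `noisy_estimate` are hypothetical, and clipping and renormalizing after adding the noise is an assumption, since the quoted text does not say how a perturbed strategy is mapped back to a valid probability distribution. The shape of the learning-rate decay is also not stated, so only its endpoints are recorded.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupConfig:
    """Hypothetical container for the values quoted in the table."""
    max_cfr_iterations: int = 10_000_000
    estimation_batch_size: int = 50
    lr_start: float = 0.01    # decay endpoints only; schedule shape not stated
    lr_end: float = 0.0001
    noise_stds: tuple = (0.1, 0.3, 0.6, 0.9, 1.2)
    opponent_seeds: int = 3
    blueprint_seeds: int = 3


def noisy_estimate(strategy, std, rng):
    """Add zero-mean Gaussian noise to each action probability.

    Assumption: negative values are clipped to zero and the result is
    renormalized so that it sums to one again.
    """
    perturbed = [max(p + rng.gauss(0.0, std), 0.0) for p in strategy]
    total = sum(perturbed)
    if total == 0.0:
        # Everything was clipped away; fall back to a uniform strategy.
        return [1.0 / len(strategy)] * len(strategy)
    return [p / total for p in perturbed]


if __name__ == "__main__":
    config = SetupConfig()
    rng = random.Random(0)
    for std in config.noise_stds:
        print(std, noisy_estimate([0.5, 0.3, 0.2], std, rng))
```

With 3 opponent seeds and 3 blueprint seeds, a full sweep under this reading would cover 9 seed combinations per noise level.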