PerfectDou: Dominating DouDizhu with Perfect Information Distillation

Authors: Guan Yang, Minghuan Liu, Weijun Hong, Weinan Zhang, Fei Fang, Guangjun Zeng, Yue Lin

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments we show how and why PerfectDou beats all existing AI programs, and achieves state-of-the-art performance.
Researcher Affiliation | Collaboration | 1 NetEase Games AI Lab, 2 Shanghai Jiao Tong University, 3 Carnegie Mellon University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Project page at https://github.com/Netease-Games-AI-Lab-Guangzhou/PerfectDou/.
Open Datasets | Yes | We evaluate PerfectDou against the following algorithms under the open-source RLCard Environment [30]. Reference [30]: Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, and Xia Hu. RLCard: A toolkit for reinforcement learning in card games. arXiv preprint arXiv:1910.04376, 2019. (A minimal RLCard setup sketch follows the table.)
Dataset Splits | No | The paper mentions training data collected via self-play and distributed training, and evaluates performance on 10,000 randomly generated decks. However, it does not provide specific training/validation/test dataset splits (e.g., percentages, sample counts for each split, or an explicit validation set definition).
Hardware Specification | Yes | All evaluations are conducted on a single core of an Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz.
Software Dependencies | No | The paper mentions using specific algorithms and the RLCard environment but does not provide specific version numbers for any software dependencies (e.g., "Python 3.8, PyTorch 1.9").
Experiment Setup | Yes | To train PerfectDou, we utilize Proximal Policy Optimization (PPO) [19] with Generalized Advantage Estimation (GAE) [18] by self-play in a distributed training system, and each worker loads the latest model after sampling 24 steps (8 for each player).
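
For reference, the sketch below shows one minimal way to instantiate the DouDizhu environment in the open-source RLCard toolkit cited in the Open Datasets entry. It is an illustration under assumptions, not the paper's evaluation code: the RandomAgent opponents and the seed are placeholders, and attribute names such as num_actions may vary across RLCard versions.

    import rlcard
    from rlcard.agents import RandomAgent

    # Three-player DouDizhu environment shipped with RLCard.
    env = rlcard.make('doudizhu', config={'seed': 42})

    # Placeholder opponents; PerfectDou's own agents would replace these
    # when reproducing the paper's head-to-head evaluation.
    env.set_agents([RandomAgent(num_actions=env.num_actions)
                    for _ in range(env.num_players)])

    # Play one full game; payoffs is the terminal reward vector
    # (landlord vs. the two peasants).
    trajectories, payoffs = env.run(is_training=False)
    print(payoffs)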
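
The Experiment Setup entry quotes the paper's use of PPO with GAE for self-play training. The snippet below is a generic GAE computation for a single episode, included only to make the cited estimator concrete; the discount and lambda values are illustrative defaults, not hyperparameters reported by the paper.

    import numpy as np

    def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
        """Generalized Advantage Estimation for one self-play episode.

        rewards: per-step rewards, shape (T,)
        values:  value estimates with a bootstrap value appended, shape (T + 1,)
        """
        T = len(rewards)
        advantages = np.zeros(T)
        gae = 0.0
        # Accumulate discounted TD residuals backwards through the episode.
        for t in reversed(range(T)):
            delta = rewards[t] + gamma * values[t + 1] - values[t]
            gae = delta + gamma * lam * gae
            advantages[t] = gae
        returns = advantages + values[:-1]  # targets for the value function
        return advantages, returns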