DeltaDou: Expert-level Doudizhu AI through Self-play

Authors: Qiqi Jiang, Kuangzheng Li, Boyao Du, Hao Chen, Hai Fang

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our results show that self-play can significantly improve the performance of our agent in this multiagent imperfect information game. Even starting with a weak AI, our agent can achieve human expert level after days of self-play and training." |
| Researcher Affiliation | Industry | "Qiqi Jiang, Kuangzheng Li, Boyao Du, Hao Chen and Hai Fang, Sweet Code Inc, Beijing. {jiangqiqi, likuangzheng, duboyao, chenhao, fanghai}@itgwn.com" |
| Pseudocode | Yes | "Algorithm 1: FPMCTS in Doudizhu" |
| Open Source Code | No | The paper mentions an "open source heuristics-based algorithm" (RHCP) from another source, but does not state that its own code is available or provide a link. |
| Open Datasets | No | "In the first phase, 200,000 games were self-played by the heuristic algorithm, then the game results were used to generate the initial policy-value network under supervised learning." A sketch of this supervised bootstrap step follows the table. |
| Dataset Splits | No | The paper mentions a testing set of 100 games but does not specify a distinct validation set or provide explicit train/validation/test splits for reproduction. |
| Hardware Specification | Yes | "It took 2 months to train the network on 68 CPUs... It was run on a single 8-core computer with the average time for a move of roughly 5 to 8 seconds." |
| Software Dependencies | No | The paper names the algorithms and frameworks it builds on (e.g., MCTS, neural networks, an AlphaZero-like training loop) but does not give version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | "In the first phase, 200,000 games were self-played by the heuristic algorithm... each episode contains 8000 games, and FPMCTS contains 400 playouts. Inference is used when any player has fewer than 15 cards in hand... The number of simulations in MCTS is set to 600 and c_puct is set to 2." A sketch of the corresponding PUCT selection rule follows the table. |
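To make the supervised bootstrap quoted under Open Datasets concrete, here is a minimal sketch of fitting an initial policy-value network to (state, heuristic move, outcome) tuples extracted from the 200,000 heuristic self-play games. This is an assumed, AlphaZero-style setup: the network architecture, the combined cross-entropy/MSE loss, and all names (`PolicyValueNet`, `pretrain_step`) are illustrative, since the paper does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyValueNet(nn.Module):
    """Tiny stand-in for the paper's policy-value network.
    The real architecture is unspecified; this MLP is purely illustrative."""
    def __init__(self, state_dim, n_moves, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_moves)  # logits over moves
        self.value_head = nn.Linear(hidden, 1)         # expected game outcome

    def forward(self, x):
        h = self.body(x)
        return self.policy_head(h), torch.tanh(self.value_head(h))

def pretrain_step(net, optimizer, states, target_moves, target_values):
    """One supervised update on a batch of (state, heuristic move, outcome)
    tuples from the heuristic self-play games. The loss combines
    cross-entropy on the policy head with MSE on the value head, an
    assumed AlphaZero-style objective; the paper does not state its loss."""
    policy_logits, values = net(states)
    policy_loss = F.cross_entropy(policy_logits, target_moves)
    value_loss = F.mse_loss(values.squeeze(-1), target_values)
    loss = policy_loss + value_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, each self-played game would contribute one training tuple per move, with the value target being the final game outcome from the moving player's perspective.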
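The Experiment Setup row quotes 600 MCTS simulations and c_puct = 2. The following sketch shows the standard PUCT child-selection rule those parameters plug into; the `Node` class and function names are illustrative assumptions, and only the c_puct value itself comes from the paper. The paper's FPMCTS additionally handles imperfect information via fictitious play and hidden-hand inference, which this sketch omits.

```python
import math

class Node:
    """One state in the search tree (illustrative; not the authors' code)."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy network
        self.visit_count = 0      # N(s, a)
        self.value_sum = 0.0      # cumulative backed-up value
        self.children = {}        # action -> Node

    def q_value(self):
        # Mean action value Q(s, a); defined as zero before any visit.
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def select_child(node, c_puct=2.0):
    """Pick the child maximizing Q + U, where
    U = c_puct * P * sqrt(N_parent) / (1 + N_child).
    c_puct = 2.0 matches the value reported in the paper."""
    sqrt_parent = math.sqrt(sum(c.visit_count for c in node.children.values()))
    best_action, best_child, best_score = None, None, -float("inf")
    for action, child in node.children.items():
        u = c_puct * child.prior * sqrt_parent / (1 + child.visit_count)
        score = child.q_value() + u
        if score > best_score:
            best_action, best_child, best_score = action, child, score
    return best_action, best_child
```

Each of the 600 simulations would walk the tree with `select_child` until reaching a leaf, expand it with network priors, and back the network's value estimate up the visited path.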