Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

General Game Playing with Imperfect Information

Authors: Michael Schofield, Michael Thielscher

JAIR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We examine the design choices for the technique, show its soundness and completeness then provide some experimental results and demonstrate the use of the technique in a variety of imperfect-information games, revealing its strengths, weaknesses, and its efficiency against randomly generating samples. Improving the technique, we present HyperPlay-II, capable of correctly valuing information-gathering moves. Again, we provide some experimental results and demonstrate the use of the new technique revealing its strengths, weaknesses and its limitations.
Researcher Affiliation | Academia | Michael Schofield EMAIL Michael Thielscher EMAIL School of Computer Science and Engineering UNSW Sydney, Australia
Pseudocode | Yes | A formalism for an implementation of the technique is presented and tested to reveal its strengths, weaknesses, limitations, and its efficiency over randomly generated samples. The technique is proven to be sound and complete for all imperfect-information games in General Game Playing. ... 3.8 Pseudo Code Below is a presentation of the original process with some alterations to the nomenclature. ... The HyperPlay algorithm is summarized in Figure 4 as part of an imperfect-information game player in Figure 3.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It refers to the electronic appendix for experimental results and game topologies, but not for code. No specific repository link or explicit code release statement is made.
Open Datasets | No | The paper does not provide concrete access information (specific link, DOI, repository name, or formal citation with authors/year to a dataset file) for the games used. It mentions using 'games available within the GGP community' and 'newly converted security games' but does not provide direct links or specific citations to these game definitions as if they were datasets. While the Monty Hall game is cited (Rosenhouse, 2009), this citation refers to a book on the problem, not a dataset of game states or rules used in the experiments.
Dataset Splits | No | The paper focuses on general game playing and simulations, not on fixed datasets with traditional training/test/validation splits. It describes experimental runs and batches of games but does not specify any dataset split percentages or sample counts for reproduction, as the 'data' is generated dynamically through game playouts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts) used for running its experiments. It only generally refers to 'computational resources'.
Software Dependencies | No | The paper discusses the Game Description Language (GDL) and its extension GDL-II, which are frameworks for defining games. However, it does not specify any particular software, libraries, or programming languages with version numbers that were used to implement the HyperPlay techniques or run the experiments.
Experiment Setup | No | The paper describes the general approach to experiments, such as using 'a simple Monte Carlo player' and setting 'resources for each role... so that it plays at well below the optimal level'. It mentions game variants like '5x5 version' vs. '8x8 format' for Blind Breakthrough. However, it lacks specific numerical hyperparameters, exact resource limits, or other concrete system-level training settings needed for reproduction (e.g., learning rates, batch sizes, number of playouts, CPU time limits per move).