reproducibilityindex.ai

A Framework for Engineering Human/Agent Teaming Systems

Authors: Rick Evertsz, John Thangarajah
Pages: 2477-2484

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate this hypothesis we conducted a study with human participants using our user interface for the Star Craft strategy game, which presents pertinent, instantiated TDF-T diagrams to the human at runtime. The performance of human participants in the study indicates that their ability to work in concert with the non-player characters in the game is significantly enhanced by the timely presentation of a diagrammatic representation of team cognition.
Researcher Affiliation	Academia	Rick Evertsz, John Thangarajah RMIT University, Melbourne, Australia {rick.eversz, john.thagarajah}@rmit.edu.au
Pseudocode	No	The paper describes a framework and methodology using diagrams and prose but does not include any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper mentions using the SARL agent language and developing a proof-of-concept middleware and Java-based UI, but does not provide any link or explicit statement regarding the public availability of their source code.
Open Datasets	No	The paper describes a case study using a Star Craft testbed for human participants and evaluation checkpoints. It does not specify or provide access to a publicly available dataset in the conventional sense of machine learning training data.
Dataset Splits	No	The paper describes a user evaluation with two experimental conditions and seven checkpoints for performance evaluation. This is a human study design, not a conventional train/validation/test data split for machine learning models.
Hardware Specification	No	The paper describes a 'Star Craft/TDF-T testbed' and a 'Java-based UI' but provides no specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions 'TDF-T', 'SARL agent language (Rodriguez, Gaud, and Galland 2014)', and a 'Java-based UI'. While software components are named, specific version numbers for SARL or Java are not provided, which is necessary for reproducibility.
Experiment Setup	Yes	User Evaluation: To test this hypothesis, we developed the TDF-T/SARL/Star Craft testbed and case study described earlier. In order to successfully defend the Messenger, the participant must (i) maintain formation, (ii) not move to defend the attacked east flank (the North agent is nearer), (iii) move to a north-west position to fill the gap left by the North agent who is fighting on the east flank, (iv) fight the attacker from the north, (v) move back west into formation, (vi) fight the west attacker while moving east to keep the team in view, and (vii) fight the enemy who attacks from the south-east. These seven checkpoints were the criteria by which the human's performance was evaluated. If the Messenger was killed at any point, the scenario was re-run from the checkpoint which comes immediately after the point where the participant failed in the previous run. In this way, each participant's performance was recorded for all seven checkpoints (see Figure 5 for a diagram showing all of the checkpoints apart from (i) and (v), which only relate to screen formation around the Messenger; their inclusion would unnecessarily complicate the diagram). Experiments: Two experimental conditions were evaluated. In the TDF-T condition, TDF-T diagrams were presented, whereas in the Baseline (non TDF-T) condition, no TDF-T diagrams were shown. From a pool of 16, eight participants were randomly allocated to each condition; all had a computer science background.
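
The study design in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the paper does not say how the random allocation was performed, and the function names (`allocate`, `score_participant`), the participant IDs, the seed, and the `passes_checkpoint` callback are all hypothetical. The sketch shows an even random split of the pool into the two conditions, and the re-run rule under which a failed run resumes at the checkpoint immediately after the failure, so that all seven checkpoints are scored.

```python
import random

CHECKPOINTS = 7  # checkpoints (i) to (vii) from the Experiment Setup row


def allocate(participants, seed=None):
    """Randomly split a pool into two equally sized conditions.

    Assumption: a simple shuffle-and-halve scheme; the paper only says
    participants were "randomly allocated".
    """
    if len(participants) % 2 != 0:
        raise ValueError("pool must contain an even number of participants")
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"TDF-T": shuffled[:half], "Baseline": shuffled[half:]}


def score_participant(passes_checkpoint):
    """Record pass/fail for every checkpoint and count the runs needed.

    `passes_checkpoint(k)` is a hypothetical callback that returns True if
    the participant clears checkpoint k (1-based).
    """
    results = {}
    runs = 0
    start = 1
    while start <= CHECKPOINTS:
        runs += 1
        for k in range(start, CHECKPOINTS + 1):
            results[k] = bool(passes_checkpoint(k))
            if not results[k]:
                break  # Messenger killed: this run ends here
        start = k + 1  # next run resumes just after the last scored checkpoint
    return results, runs


if __name__ == "__main__":
    pool = [f"P{i:02d}" for i in range(1, 17)]  # 16 hypothetical participant IDs
    groups = allocate(pool, seed=0)
    print({name: len(members) for name, members in groups.items()})

    # Hypothetical participant who fails checkpoints 3 and 6
    results, runs = score_participant(lambda k: k not in (3, 6))
    print(results, runs)
```

Fixing the seed makes the allocation repeatable, which is one way a study of this kind could document its randomization; whether the authors did so is not stated.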