reproducibilityindex.ai

Investigating Human Priors for Playing Video Games

Authors: Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Tom Griffiths, Alexei Efros

ICML 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Given a sample game, we conduct a series of ablation studies to quantify the importance of various priors on human performance. We do this by modifying the video game environment to systematically mask different types of visual information that could be used by humans as priors.
Researcher Affiliation	Academia	University of California, Berkeley. Correspondence to: Rachit Dubey <rach0012@berkeley.edu>.
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Videos and the game manipulations are available at https://rach0012.github.io/humanRL_website/.
Open Datasets	No	The paper describes creating a custom game environment and collecting data from human participants. It does not use a publicly available dataset for its primary human experiments, nor does it provide access information for the collected human play data.
Dataset Splits	No	The paper describes experiments with human participants and RL agents but does not specify dataset splits (training, validation, test) for machine learning model development. For human experiments, it states: 'Each participant was only allowed to complete a game once, and could not participate again (i.e. different 120 participants played each version of the game).'
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments or training the RL agent.
Software Dependencies	No	The paper mentions using 'A3C (Mnih et al., 2016)' and a 'curiosity-based RL algorithm specifically tailored to sparse-reward settings (Pathak et al., 2017)', but it does not specify any version numbers for these or other software libraries/dependencies.
Experiment Setup	Yes	We designed a browser-based platform game consisting of an agent sprite, platforms, ladders, an angry pink object that kills the agent, spikes that are dangerous to jump on, a key, and a door (see Figure 2 (a)). The agent sprite can be moved with the help of arrow keys. A terminal reward of +1 is provided when the agent reaches the door after having taken the key, thereby terminating the game. We quantified human performance on each version of the game by recruiting 120 participants from Amazon Mechanical Turk. Each participant was instructed to finish the game as quickly as possible using the arrow keys as controls, but no information about the goals or the reward structure of the game was communicated. Each participant was paid $1 for successfully completing the game. The maximum time allowed for playing the game was set to 30 minutes. For each game version, we report the mean performance of five random seeds that succeeded.
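
The terminal reward rule in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code: the `GameState` class, its field names, and the `step_reward` function are hypothetical, and the reward of 0 on all other steps is an assumption based on the sparse-reward setting the paper mentions rather than something the row states.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GameState:
    """Hypothetical minimal state; field names are illustrative only."""
    has_key: bool = False  # True once the agent has picked up the key
    at_door: bool = False  # True while the agent is standing at the door


def step_reward(state: GameState) -> Tuple[float, bool]:
    """Return (reward, done) for the current state.

    A reward of +1 is given, and the episode ends, only when the agent
    is at the door while holding the key. Returning 0 everywhere else
    is an assumption made for this sketch.
    """
    if state.at_door and state.has_key:
        return 1.0, True
    return 0.0, False
```

Under these assumptions, reaching the door without the key yields no reward and does not end the episode, which is one way to read "after having taken the key". Events such as the agent being killed are deliberately left out of the sketch.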