Investigating Human Priors for Playing Video Games
Authors: Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Tom Griffiths, Alexei Efros
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Given a sample game, we conduct a series of ablation studies to quantify the importance of various priors on human performance. We do this by modifying the video game environment to systematically mask different types of visual information that could be used by humans as priors. |
| Researcher Affiliation | Academia | University of California, Berkeley. Correspondence to: Rachit Dubey <rach0012@berkeley.edu>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Videos and the game manipulations are available at https://rach0012.github.io/humanRL_website/. |
| Open Datasets | No | The paper describes creating a custom game environment and collecting data from human participants. It does not use a publicly available dataset for its primary human experiments, nor does it provide access information for the collected human play data. |
| Dataset Splits | No | The paper describes experiments with human participants and RL agents but does not specify dataset splits (training, validation, test) for machine learning model development. For human experiments, it states: 'Each participant was only allowed to complete a game once, and could not participate again (i.e. different 120 participants played each version of the game).' |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments or training the RL agent. |
| Software Dependencies | No | The paper mentions using 'A3C (Mnih et al., 2016)' and a 'curiosity-based RL algorithm specifically tailored to sparse-reward settings (Pathak et al., 2017)', but it does not specify any version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | We designed a browser-based platform game consisting of an agent sprite, platforms, ladders, an angry pink object that kills the agent, spikes that are dangerous to jump on, a key, and a door (see Figure 2(a)). The agent sprite can be moved with the arrow keys. A terminal reward of +1 is provided when the agent reaches the door after having taken the key, thereby terminating the game. We quantified human performance on each version of the game by recruiting 120 participants from Amazon Mechanical Turk. Each participant was instructed to finish the game as quickly as possible using the arrow keys as controls, but no information about the goals or the reward structure of the game was communicated. Each participant was paid $1 for successfully completing the game. The maximum time allowed for playing the game was set to 30 minutes. For each game version, we report the mean performance of five random seeds that succeeded. |
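
The reward rule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code: the `GameState` class, the `step_reward` function, and the event names `"key"`, `"door"`, and `"other"` are hypothetical, and the sketch assumes that every non-terminal event (including picking up the key) yields zero reward. It does not model deaths from the pink object or the spikes.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical minimal state: whether the key is held and whether the game ended."""
    has_key: bool = False
    done: bool = False


def step_reward(state: GameState, event: str) -> float:
    """Return the reward for one event and update the state in place.

    Assumption: the only non-zero reward is the terminal +1 given when
    the agent reaches the door after having taken the key.
    """
    if state.done:
        # The game has already terminated; nothing more can be earned.
        return 0.0
    if event == "key":
        # Assumed: picking up the key gives no reward of its own.
        state.has_key = True
        return 0.0
    if event == "door" and state.has_key:
        # Terminal reward: door reached with the key in hand.
        state.done = True
        return 1.0
    # Reaching the door without the key, or any other event.
    return 0.0


if __name__ == "__main__":
    state = GameState()
    for event in ["door", "other", "key", "door"]:
        print(event, step_reward(state, event), state.done)
```

Under these assumptions, reaching the door before taking the key earns nothing and does not end the game, which is what makes the reward sparse.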