Reuse of Neural Modules for General Video Game Playing
Authors: Alexander Braylan, Mark Hollenbeck, Elliot Meyerson, Risto Miikkulainen
AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | GRUSM-ESP was evaluated in a stochastic version of the Atari 2600 general video game-playing platform using the Arcade Learning Environment simulator (ALE; Bellemare et al. 2013). The approach is more general than previous approaches to neural transfer for reinforcement learning: it is domain-agnostic and requires no prior assumptions about the nature of task relatedness or mappings. The experiments demonstrate that the method improves performance in some of the more complex Atari 2600 games and that the success of transfer can be predicted from a high-level characterization of game dynamics. |
| Researcher Affiliation | Academia | Alexander Braylan, Mark Hollenbeck, Elliot Meyerson, Risto Miikkulainen Department of Computer Science, The University of Texas at Austin {braylan,mhollen,ekm,risto}@cs.utexas.edu |
| Pseudocode | No | The paper describes the GRUSM-ESP architecture and process textually and with a diagram, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing its source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | GRUSM-ESP was evaluated in a stochastic version of the Atari 2600 general video game-playing platform using the Arcade Learning Environment simulator (ALE; Bellemare et al. 2013). |
| Dataset Splits | No | The paper describes its experimental setup, including the number of generations per run and evaluations per generation, and uses leave-one-out cross-validation for its analysis model, but it does not provide specific training/test/validation dataset splits (e.g., percentages or counts) for the main agent learning process. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using the 'Arcade Learning Environment simulator (ALE; Bellemare et al. 2013)' and an 'ϵ-repeat action approach as suggested by Hausknecht and Stone (2015)' (illustrated in the second sketch after this table), but it does not provide specific version numbers for these or any other software dependencies needed for replication. |
| Experiment Setup | Yes | Each run lasted 200 generations with 100 evaluations per generation. To interface with ALE, the output layer of each network consists of a 3x3 substrate representing the nine directional movements of the Atari joystick, in addition to a single node representing the Fire button. The input layer consisted of a series of object representations manually generated as previously described by Hausknecht et al. (2013): the location of each object on the screen was represented in an 8x10 input substrate corresponding to the object's class, with the number of object classes varying between one and four. All hidden and output neurons use a hyperbolic tangent activation function. Networks include a single hidden layer with recurrent self-loops on hidden nodes; they are otherwise feedforward. A minimal code sketch of this topology follows the table. |
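
The network topology described in the Experiment Setup row can be made concrete with a short sketch. This is not the authors' implementation: the paper evolves the weights with ESP rather than training them, and the hidden-layer size, initialization, and all names below are assumptions used only to illustrate the wiring (per-class 8x10 input substrates, one hidden layer with recurrent self-loops, tanh units, and a 3x3 joystick substrate plus a Fire node).

```python
import numpy as np

# Hypothetical sketch of the network shape described in the paper; the
# hidden-layer size and weight initialization are assumptions, and in the
# actual work these weights are evolved with ESP, not hand-set or trained.
class GameNetSketch:
    def __init__(self, n_object_classes, n_hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        n_in = n_object_classes * 8 * 10       # one 8x10 substrate per object class
        n_out = 3 * 3 + 1                      # nine joystick directions + Fire button
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.w_self = rng.normal(0.0, 0.1, n_hidden)   # recurrent self-loop weights
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.h = np.zeros(n_hidden)            # hidden state carried across frames

    def step(self, substrates):
        # substrates: array of shape (n_object_classes, 8, 10) marking object locations
        x = np.asarray(substrates).ravel()
        # Hidden nodes are feedforward except for their own recurrent self-loops.
        self.h = np.tanh(self.W_in @ x + self.w_self * self.h)
        return np.tanh(self.W_out @ self.h)    # activations for 9 directions + Fire
```

How the joystick action is read off the ten output activations is left unspecified here, since the paper does not detail that mapping beyond the substrate layout.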
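
The ϵ-repeat action approach cited in the Software Dependencies row injects stochasticity by occasionally repeating the agent's previous action instead of the newly selected one. The sketch below shows that idea under the assumption that repetition happens with a fixed probability ϵ; the wrapper name and the example value of 0.25 are not taken from the paper.

```python
import numpy as np

def epsilon_repeat_policy(select_action, epsilon=0.25, seed=0):
    """Wrap an action-selection function so that, with probability epsilon,
    the previous action is repeated instead of the newly selected one
    (an approximation of the epsilon-repeat scheme of Hausknecht and Stone, 2015)."""
    rng = np.random.default_rng(seed)
    prev_action = None

    def act(observation):
        nonlocal prev_action
        action = select_action(observation)
        if prev_action is not None and rng.random() < epsilon:
            action = prev_action   # repeat the last action with probability epsilon
        prev_action = action
        return action

    return act
```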