Playing FPS Games with Deep Reinforcement Learning
Authors: Guillaume Lample, Devendra Singh Chaplot
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as average humans in deathmatch scenarios. |
| Researcher Affiliation | Academia | Guillaume Lample, Devendra Singh Chaplot {glample,chaplot}@cs.cmu.edu School of Computer Science Carnegie Mellon University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We use the ViZDoom platform (Kempka et al. 2016) to conduct all our experiments and evaluate our methods on the deathmatch scenario. |
| Dataset Splits | No | The paper specifies training and testing maps but does not mention a separate validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not list its ancillary software dependencies or their version numbers. |
| Experiment Setup | Yes | All networks were trained using the RMSProp algorithm and minibatches of size 32. Network weights were updated every 4 steps, so experiences are sampled on average 8 times during the training (Van Hasselt, Guez, and Silver 2015). The replay memory contained the one million most recent frames. The discount factor was set to γ = 0.99. We used an ϵ-greedy policy during the training, where ϵ was linearly decreased from 1 to 0.1 over the first million steps, and then fixed to 0.1. We used a 16/9 resolution of 440x225 which we resized to 108x60. |
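
The ViZDoom platform cited in the Open Datasets row is publicly available with a Python API. The sketch below is a minimal, hypothetical example of loading the deathmatch scenario; the config path and the chosen action are placeholders, not the authors' actual setup.

```python
# Minimal sketch of loading the ViZDoom deathmatch scenario (not the authors' code).
# Assumes the `vizdoom` Python package is installed; the config path is a placeholder.
from vizdoom import DoomGame

game = DoomGame()
game.load_config("scenarios/deathmatch.cfg")  # scenario config shipped with ViZDoom; adjust path to your install
game.init()

game.new_episode()
while not game.is_episode_finished():
    state = game.get_state()              # screen buffer plus game variables
    reward = game.make_action([1, 0, 0])  # placeholder action: press the first configured button
game.close()
```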
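
The hyperparameters quoted in the Experiment Setup row translate directly into a training configuration. The following is a hedged sketch with illustrative variable names, assuming a standard DQN-style replay/annealing setup; it is not the authors' released code.

```python
# Illustrative reconstruction of the reported training hyperparameters (assumptions noted inline).
BATCH_SIZE = 32                  # RMSProp minibatch size
UPDATE_FREQUENCY = 4             # network weights updated every 4 steps
REPLAY_MEMORY_SIZE = 1_000_000   # one million most recent frames
GAMMA = 0.99                     # discount factor
EPS_START, EPS_END = 1.0, 0.1    # epsilon-greedy exploration bounds
EPS_DECAY_STEPS = 1_000_000      # linear decay over the first million steps
FRAME_SIZE = (108, 60)           # 440x225 screens resized to 108x60

def epsilon(step: int) -> float:
    """Linearly anneal epsilon from 1.0 to 0.1, then hold it fixed at 0.1."""
    frac = min(step / EPS_DECAY_STEPS, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)
```

For example, `epsilon(0)` returns 1.0, `epsilon(500_000)` returns 0.55, and any step beyond one million returns 0.1, matching the schedule described in the quoted setup.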