Control of Memory, Active Perception, and Action in Minecraft
Authors: Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, Honglak Lee
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "The experimental results show that our new architectures generalize to unseen environments better than existing DRL architectures." From Section 5 (Experiments): "The experiments, baselines, and tasks are designed to investigate how useful context-dependent memory retrieval is for generalizing to unseen maps, and when memory feedback connections in FRMQN are helpful." |
| Researcher Affiliation | Academia | Junhyuk Oh JUNHYUK@UMICH.EDU Valliappa Chockalingam VALLI@UMICH.EDU Satinder Singh BAVEJA@UMICH.EDU Honglak Lee HONGLAK@UMICH.EDU Computer Science & Engineering, University of Michigan |
| Pseudocode | No | The paper describes the architectures using equations and diagrams, such as "e_t = φ^enc(x_t)" (Eq. 1) and "p_{t,i} = exp(h_t^T M_t^key[i]) / Σ_{j=1}^M exp(h_t^T M_t^key[j])" (Eq. 4), but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions that its implementation is based on Torch7 and a public DQN implementation, and provides a link to "Game play videos" at "https://sites.google.com/a/umich.edu/junhyuk-oh/icml2016-minecraft". However, it does not provide concrete access to the source code for the methodology described in this paper. |
| Open Datasets | No | The paper describes creating custom tasks and maps within Minecraft, stating, "We then use these tasks to systematically compare and contrast existing deep reinforcement learning (DRL) architectures with our new memory-based DRL architectures." and "We generated 500 training and 500 unseen maps in such a way that there is little overlap between the two sets of visual patterns." While the environment is Minecraft, the specific task generation procedures and maps are not provided as a publicly accessible dataset with a direct link or citation. |
| Dataset Splits | No | The paper mentions using "training set of maps" and "unseen set of maps" for evaluation (e.g., "We generated 500 training and 500 unseen maps"). It also states, "For each run, we measured the average success rate of 10 best-performing parameters based on the performance on unseen set of maps." This implies a selection process, but there is no explicit mention of a distinct validation dataset split with specific percentages or sample counts. |
| Hardware Specification | No | The paper describes the software implementation details, such as "Our implementation is based on Torch7 (Collobert et al., 2011), a public DQN implementation (Mnih et al., 2015), and a Minecraft Forge Mod.", but it does not provide any specific hardware details like GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions key software components used: "Our implementation is based on Torch7 (Collobert et al., 2011), a public DQN implementation (Mnih et al., 2015), and a Minecraft Forge Mod." However, it does not provide specific version numbers for any of these dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | "Input frames from Minecraft are captured as 32×32 RGB images." "All the architectures use the same 2-layer CNN architecture as described in the supplementary material." "In the DQN and DRQN architectures, the last convolutional layer is followed by a fully-connected layer with 256 hidden units." "In addition, 256 LSTM units are used in DRQN, RMQN, and FRMQN." "An agent receives -0.04 reward at every time step." "The last 12 frames were given as input for all architectures, and the size of memory for our architectures was 11." |
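The context-dependent memory retrieval assessed above (Eq. 4 of the paper) is a softmax attention over memory keys. The NumPy sketch below illustrates that mechanism under assumed shapes; the function and variable names (`memory_read`, `M_key`, `M_val`) are illustrative, not the authors' Torch7 implementation. The memory size of 11 and 256 hidden units follow the experiment setup quoted in the table.

```python
import numpy as np

def memory_read(h_t, M_key, M_val):
    """Soft attention over memory: p_i ∝ exp(h_t · M_key[i]) (Eq. 4 sketch)."""
    scores = M_key @ h_t                       # (M,) dot products h_t^T M_key[i]
    scores = scores - scores.max()             # subtract max for numerical stability
    p = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    return M_val.T @ p                         # retrieved memory: weighted sum of value slots

rng = np.random.default_rng(0)
M, d = 11, 256                   # memory size 11, 256 hidden units (from the paper's setup)
h_t = rng.standard_normal(d)     # context vector at time t
M_key = rng.standard_normal((M, d))
M_val = rng.standard_normal((M, d))
o_t = memory_read(h_t, M_key, M_val)
print(o_t.shape)
```

The attention weights sum to 1 by construction, so the retrieved vector is a convex combination of the value slots, which is what lets retrieval stay well-defined on unseen maps.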