Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
Authors: Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli
ICML 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on a stochastic 3D domain show that the proposed ideas are crucial for generalization to longer instructions as well as unseen instructions. |
| Researcher Affiliation | Collaboration | University of Michigan, Google Brain, Microsoft Research. |
| Pseudocode | Yes | Algorithm 1 Subtask update (Soft) |
| Open Source Code | No | The demo videos are available at the following website: https://sites.google.com/a/umich.edu/junhyuk-oh/task-generalization. This link is for demo videos, not source code for the methodology. |
| Open Datasets | No | We developed a 3D visual environment using Minecraft based on Oh et al. (2016) as shown in Figure 1. This describes a custom environment, and while it cites a paper, it does not provide concrete access information for the specific data used. |
| Dataset Splits | No | The paper mentions training, evaluation, and test sets, but does not provide specific percentages, sample counts, or clear predefined splits for training, validation, or test sets. |
| Hardware Specification | No | The paper mentions '16 CPU threads' but does not specify any particular CPU model, GPU, or other hardware components used for running experiments. |
| Software Dependencies | No | The paper refers to using 'actor-critic method' and 'LSTM' but does not specify any software names with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | The network architecture of our parameterized skill consists of 4 convolution layers and one LSTM (Hochreiter and Schmidhuber, 1997) layer. We conducted curriculum training by changing the size of the world and the density of objects and walls according to the agent's success rate. We implemented an actor-critic method with 16 CPU threads based on Sukhbaatar et al. (2015). The parameters are updated after 8 episodes for each thread. |
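The quoted setup names the architecture's skeleton (4 convolution layers feeding one LSTM) but not its hyperparameters. The sketch below is a minimal illustration of how such a conv stack determines the LSTM's input width; the input resolution, kernel sizes, strides, and channel count are assumptions for illustration, not values reported in the paper.

```python
# Hypothetical sketch of a 4-conv + LSTM skeleton as described in the
# Experiment Setup row. All numeric choices below are assumptions.

def conv_out(size, kernel, stride, pad=0):
    # Spatial size after one convolution (standard floor formula).
    return (size + 2 * pad - kernel) // stride + 1

# Assumed 84x84 visual observation; four conv layers as (kernel, stride).
layers = [(8, 4), (4, 2), (3, 1), (3, 1)]

size = 84
for kernel, stride in layers:
    size = conv_out(size, kernel, stride)

channels = 64  # assumed channel count of the final conv layer
lstm_input_dim = channels * size * size  # flattened features fed to the LSTM
print(size, lstm_input_dim)  # prints: 5 1600
```

With these assumed shapes, the flattened 5x5x64 feature map would be the per-timestep input to the single LSTM layer; the actual dimensions in the paper may differ.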