Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

Authors: Sungryull Sohn, Junhyuk Oh, Honglak Lee

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The experimental results on two 2D visual domains show that our agent can perform complex reasoning to find a near-optimal way of executing the subtask graph and generalize well to the unseen subtask graphs. In the experiment, we investigated the following research questions:"
Researcher Affiliation | Collaboration | Sungryull Sohn (University of Michigan, srsohn@umich.edu); Junhyuk Oh (University of Michigan, junhyuk@google.com); Honglak Lee (Google Brain and University of Michigan, honglak@google.com)
Pseudocode | Yes | Algorithm 1: Policy optimization (a hedged actor-critic sketch appears after this table).
Open Source Code | No | The paper neither states that source code for the described method will be released nor provides a link to a code repository.
Open Datasets | No | Mining domain: "The set of subtasks and preconditions are hand-coded based on the crafting recipes in Minecraft, and used as a template to generate 640 random subtask graphs." Playground: "We randomly generated 500 graphs for training and 2,000 graphs for testing." The subtask-graph datasets were generated by the authors for their experiments, and no public access information is provided.
Dataset Splits | No | For the Mining domain, the paper states: "We used 200 for training and 440 for testing." For the Playground domain: "We randomly generated 500 graphs for training and 2,000 graphs for testing." No explicit validation split is mentioned (see the generation-and-split sketch after this table).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types, or memory sizes) used to run the experiments.
Software Dependencies | No | The paper mentions certain methods and frameworks (e.g., actor-critic, MazeBase), but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | "We used ηd=1e-4, ηc=3e-6 for distillation and ηac=1e-6, ηc=3e-7 for fine-tuning in the experiment." (A two-phase optimizer sketch using these rates follows the table.)
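
The paper's pseudocode is "Algorithm 1: Policy optimization," built on an actor-critic method. As a point of reference only, here is a minimal advantage actor-critic (A2C) update sketch in PyTorch; the network, batch format, and learning rate are hypothetical stand-ins and do not reproduce the paper's actual architecture or its policy-distillation step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    """Shared body with a policy (actor) head and a value (critic) head."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # action logits
        self.value_head = nn.Linear(hidden, 1)           # state value V(s)

    def forward(self, obs):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

def a2c_update(model, optimizer, obs, actions, returns, value_coef=0.5):
    """One advantage actor-critic gradient step on a batch of transitions."""
    logits, values = model(obs)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    advantages = returns - values.detach()        # A(s,a) = R - V(s)
    policy_loss = -(chosen * advantages).mean()   # actor: policy gradient
    value_loss = F.mse_loss(values, returns)      # critic: regress V toward R
    loss = policy_loss + value_coef * value_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage on random data, just to show the shapes involved.
model = ActorCritic(obs_dim=8, n_actions=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
a2c_update(model, opt,
           torch.randn(32, 8),            # observations
           torch.randint(0, 4, (32,)),    # actions taken
           torch.randn(32))               # empirical returns
```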
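The subtask-graph datasets were generated by the authors and are not public. Purely to illustrate the reported split sizes for the Playground domain (500 training / 2,000 test graphs), here is a hedged sketch of generating random DAG-structured subtask graphs and splitting them; the generator itself is an assumption, not the paper's procedure.

```python
import random

def random_subtask_graph(n_subtasks, max_preconditions=2, rng=random):
    """Map each subtask index to a list of precondition subtasks.

    Preconditions only reference earlier-indexed subtasks, so the
    dependency structure is acyclic by construction.
    """
    graph = {}
    for i in range(n_subtasks):
        k = rng.randint(0, min(max_preconditions, i))
        graph[i] = sorted(rng.sample(range(i), k))
    return graph

rng = random.Random(0)
# Playground-style split from the paper: 500 training / 2,000 test graphs.
graphs = [random_subtask_graph(rng.randint(4, 16), rng=rng) for _ in range(2500)]
train_graphs, test_graphs = graphs[:500], graphs[500:]
assert len(train_graphs) == 500 and len(test_graphs) == 2000
```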
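The only setup details the assessment quotes are the learning rates for the two training phases. Below is a minimal sketch of wiring them into per-phase optimizers; the optimizer choice (Adam) and the actor/critic parameter split are assumptions, and only the rates themselves come from the paper.

```python
import torch

# Learning rates quoted from the paper's experiment setup.
DISTILL_LRS = {"eta_d": 1e-4, "eta_c": 3e-6}    # distillation phase
FINETUNE_LRS = {"eta_ac": 1e-6, "eta_c": 3e-7}  # fine-tuning phase

def make_optimizers(actor_params, critic_params, phase):
    """Build (actor, critic) optimizers for the given training phase.

    Adam is an assumed choice; the paper specifies only the rates.
    """
    if phase == "distillation":
        lrs = (DISTILL_LRS["eta_d"], DISTILL_LRS["eta_c"])
    elif phase == "finetune":
        lrs = (FINETUNE_LRS["eta_ac"], FINETUNE_LRS["eta_c"])
    else:
        raise ValueError(f"unknown phase: {phase}")
    return (torch.optim.Adam(actor_params, lr=lrs[0]),
            torch.optim.Adam(critic_params, lr=lrs[1]))
```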