Diversity-Driven Extensible Hierarchical Reinforcement Learning
Authors: Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Mai Xu
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental studies evaluate DEHRL with nine baselines from four perspectives in two domains; the results show that DEHRL outperforms the state-of-the-art baselines in all four aspects. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Oxford, UK; (2) School of Electronic and Information Engineering, Beihang University, China; (3) State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, China |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Easy-to-run code has been released to further clarify the details and facilitate future research, where evaluations and visualizations on more domains, such as Montezuma's Revenge and PyBullet (an open-source alternative to MuJoCo), can also be found. Repository: https://github.com/YuhangSong/DEHRL |
| Open Datasets | No | The paper uses game environments (Overcooked, Minecraft) which are not explicitly referred to as publicly available datasets with concrete access information (link, citation, or repository). |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or references to predefined splits) for training, validation, or testing. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions the PPO algorithm and deep neural networks but does not provide specific version numbers for software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | The important hyper-parameters of DEHRL are summarized in Table 1, while other details (e.g., neural network architectures and hyper-parameters in the policy training algorithm) are provided in (Song et al. 2018a). Table 1 (settings of DEHRL): A0 = 16, A1 = 5, A2 = 5; T0 = 1, T1 = 1×4, T2 = 1×4×12. |
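
For concreteness, here is a minimal sketch of how Table 1's settings could be encoded for a re-implementation. Only the values (A0 = 16, A1 = 5, A2 = 5; T0 = 1, T1 = 1×4, T2 = 1×4×12) come from the paper; the `LevelConfig` structure and names are assumptions, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class LevelConfig:
    """Hypothetical container for one level of the DEHRL hierarchy (values from Table 1)."""
    num_actions: int  # size of the action space at this level, A^l
    time_scale: int   # primitive environment steps spanned by one action, T^l

# Encoding of Table 1: A0=16, A1=5, A2=5; T0=1, T1=1*4, T2=1*4*12.
levels = [
    LevelConfig(num_actions=16, time_scale=1),           # level 0: primitive actions
    LevelConfig(num_actions=5,  time_scale=1 * 4),       # level 1: 4 primitive steps
    LevelConfig(num_actions=5,  time_scale=1 * 4 * 12),  # level 2: 48 primitive steps
]

for i, cfg in enumerate(levels):
    print(f"level {i}: |A{i}| = {cfg.num_actions}, T{i} = {cfg.time_scale} primitive steps")
```

Note how each level's time scale multiplies the one below it (1, 4, 48 primitive steps), which is the sense in which the hierarchy is temporally extensible.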