Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

Authors: Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on discrete and continuous control tasks show that incorporating the adjacency constraint improves the performance of state-of-the-art HRL approaches in both deterministic and stochastic environments.
Researcher Affiliation | Academia | Tianren Zhang^1, Shangqi Guo^1, Tian Tan^2, Xiaolin Hu^3,4,5, Feng Chen^1,6,7; 1 Department of Automation, Tsinghua University; 2 Department of Civil and Environmental Engineering, Stanford University; 3 Department of Computer Science and Technology, Tsinghua University; 4 Beijing National Research Center for Information Science and Technology; 5 State Key Laboratory of Intelligent Technology and Systems; 6 Beijing Innovation Center for Future Chip; 7 LSBDPA Beijing Key Laboratory
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/trzhang0116/HRAC.
Open Datasets | Yes | Continuous tasks include Ant Gather, Ant Maze and Ant Maze Sparse, where the first two tasks are widely used benchmarks in the HRL community [6, 11, 26, 25, 22], and the third task is a more challenging navigation task with sparse rewards.
Dataset Splits | No | The paper describes training and testing procedures but does not explicitly give dataset split percentages or sample counts for training, validation, and test sets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the MuJoCo simulator but does not provide its version number or any other software dependencies with version numbers required to replicate the experiments.
Experiment Setup | Yes | Given the current state s and the subgoal generation frequency k, the high-level only needs to explore in a subset of subgoals covering states that the low-level can possibly reach within k steps. And H(x, k) = max(x/k − 1, 0) is a hinge loss function and η is a balancing coefficient. And where g_i = ϕ(s_i), g_j = ϕ(s_j), and a hyper-parameter δ > 0 is used to create a gap between the embeddings.
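
The quoted setup names two pieces of the adjacency machinery: a hinge penalty H(x, k) = max(x/k − 1, 0), weighted by η, that discourages subgoals lying outside the k-step reachable region, and a margin δ that separates the embeddings of adjacent and non-adjacent state pairs. The snippet below is a minimal PyTorch-style sketch of both terms under stated assumptions; the adjacency threshold being the divisor k, the use of Euclidean distance in embedding space, and the contrastive form of the margin loss are illustrative choices inferred from the quotes, not the authors' released implementation.

```python
import torch

def hinge_penalty(dist, k, eta=1.0):
    """Quoted hinge H(x, k) = max(x/k - 1, 0), scaled by the balancing
    coefficient eta. Zero while the (assumed embedding-space) distance
    stays within the k-step adjacency region; grows linearly beyond it."""
    return eta * torch.clamp(dist / k - 1.0, min=0.0)

def embedding_margin_loss(g_i, g_j, adjacent, k, delta=0.1):
    """Illustrative contrastive-style loss over embedding pairs
    g_i = phi(s_i), g_j = phi(s_j). Pairs labelled adjacent (1.0) are
    pulled within distance k; non-adjacent pairs (0.0) are pushed beyond
    k + delta, so delta creates the gap mentioned in the quote."""
    dist = torch.norm(g_i - g_j, p=2, dim=-1)
    pull = adjacent * torch.clamp(dist - k, min=0.0)
    push = (1.0 - adjacent) * torch.clamp(k + delta - dist, min=0.0)
    return (pull + push).mean()

# Example usage with random embeddings and labels (batch of 8 pairs).
g_i, g_j = torch.randn(8, 32), torch.randn(8, 32)
adjacent = torch.randint(0, 2, (8,)).float()
loss = embedding_margin_loss(g_i, g_j, adjacent, k=10.0, delta=0.1)
```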