reproducibilityindex.ai

Learning Subgoal Representations with Slow Dynamics

Authors: Siyuan Li, Lulu Zheng, Jianhao Wang, Chongjie Zhang

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments to compare our approach to existing state-of-the-art methods in HRL and in efficient exploration.
Researcher Affiliation	Academia	Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
Pseudocode	Yes	Algorithm 1 LESSON algorithm
Open Source Code	Yes	Find open-source code at https://github.com/SiyuanLee/LESSON
Open Datasets	Yes	We compare LESSON with state-of-the-art HRL and exploration methods on complex MuJoCo tasks (Todorov et al., 2012).
Dataset Splits	No	The paper evaluates performance during training but does not specify a separate validation dataset split or strategy for hyperparameter tuning.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments.
Software Dependencies	No	The paper mentions software like SAC, Adam optimizer, and MuJoCo, but does not provide specific version numbers for any of these components or other libraries.
Experiment Setup	Yes	Discount factor γ = 0.99 for both levels. Adam optimizer; learning rate 0.0002. Soft update targets τ = 0.005 for both levels. Replay buffer of size 1e6 for both levels. Reward scaling of 0.1 for both levels. Entropy coefficient of SAC α = 0.2 for both levels. Low-level policy length c = 10 for the Point robot and c = 20 for the Ant robot except for the Ant Push task. In the Ant Push task, c = 50. Subgoal dimension of size 2.
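
For anyone re-running the experiments, the Experiment Setup values above could be collected into a single configuration object. The following is a minimal sketch, not taken from the authors' repository: the class name `LessonHyperparams`, its field names, and the helper `low_level_horizon` are all hypothetical, and only the numeric values come from the row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LessonHyperparams:
    """Hypothetical container; values are those quoted in the Experiment Setup row."""
    discount: float = 0.99               # gamma, both levels
    learning_rate: float = 2e-4          # Adam optimizer
    soft_update_tau: float = 0.005       # target network soft update, both levels
    replay_buffer_size: int = 1_000_000  # 1e6 transitions, both levels
    reward_scale: float = 0.1            # both levels
    sac_entropy_alpha: float = 0.2       # SAC entropy coefficient, both levels
    subgoal_dim: int = 2


def low_level_horizon(robot: str, task: str) -> int:
    """Return the low-level policy length c for a robot/task pair.

    The robot and task strings are an assumed naming scheme, not the
    identifiers used in the released code.
    """
    robot_key = robot.lower()
    task_key = task.lower().replace(" ", "").replace("_", "")
    if robot_key == "point":
        return 10
    if robot_key == "ant":
        # Ant Push is the one exception among the Ant tasks.
        return 50 if task_key == "antpush" else 20
    raise ValueError(f"unknown robot: {robot!r}")


if __name__ == "__main__":
    params = LessonHyperparams()
    print(params)
    print(low_level_horizon("ant", "Ant Push"))
```

Keeping the task-dependent value c in a separate function, rather than as a dataclass field, reflects that it is the only setting in the row that varies across tasks.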