Hierarchical Skills for Efficient Exploration
Authors: Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'We perform an experimental analysis of our hierarchical pre-training on a new set of challenging sparse-reward tasks with simulated bipedal robots. Our experiments show that...' The paper also includes a Section 5, 'Experimental Results'. |
| Researcher Affiliation | Collaboration | Jonas Gehring (1, 2), Gabriel Synnaeve (1), Andreas Krause (2), Nicolas Usunier (1). (1) Facebook AI Research; (2) ETH Zürich |
| Pseudocode | Yes | Pseudo-code for the pre-training algorithm and further implementation details are provided in Appendix C. |
| Open Source Code | Yes | Code and videos are available at https://facebookresearch.github.io/hsd3. |
| Open Datasets | No | The paper introduces a 'new benchmark suite of diverse, sparse-reward tasks for bipedal robots' and states 'All environments are provided via a standard Gym interface [5] with accompanying open-source code, enabling easy use and re-use.' While the environments are open-sourced, they are referred to as tasks/environments, not a public dataset in the traditional sense, and no specific dataset link or citation is provided for a pre-existing dataset that was used for training. |
| Dataset Splits | No | The paper mentions training and evaluation but does not specify explicit train/validation/test splits, either by percentage, sample count, or reference to predefined standard splits. |
| Hardware Specification | Yes | Pre-training takes approximately 3 days on 2 GPUs (V100) |
| Software Dependencies | No | Mentions 'MuJoCo physics simulator [49, 48]', 'standard Gym interface [5]', 'Soft Actor-Critic (SAC) [18]', 'Adam [22]'. However, specific version numbers for these software dependencies are not provided in the main text. |
| Experiment Setup | No | Further details regarding the training setup, hyper-parameters for skill and high-level policy training, as well as for baselines, can be found in Appendix E. Since the question asks for details 'in the main text', and these details are deferred to an appendix, the answer is no. |