Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation

Authors: Daesol Cho, Seungjae Lee, H. Jin Kim

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that our algorithm significantly outperforms these prior methods in a variety of challenging navigation tasks and robotic manipulation tasks in a quantitative and qualitative way." (Section 5, Experiments: "We include 6 environments to validate our proposed method"; Section 5.1, Experimental Results; Section 5.2, Ablation Study)
Researcher Affiliation | Academia | "Daesol Cho, Seungjae Lee, H. Jin Kim. Seoul National University, Automation and Systems Research Institute (ASRI), Artificial Intelligence Institute of Seoul National University (AIIS). {dscho1234, ysz0301, hjinkim}@snu.ac.kr"
Pseudocode | Yes | "The overall training process is summarized in Algorithm 1 in Appendix B."; "Algorithm 2: Meta-NML (Li et al., 2021)"
Open Source Code | Yes | "Code is available: https://github.com/jayLEE0301/outpace_official"
Open Datasets | Yes | "We referred to the metaworld (Yu et al., 2020) and EARL (Sharma et al., 2021) environments." (See the environment-instantiation sketch below the table.)
Dataset Splits | No | The paper provides a 'Meta-learner train sample size' and 'Meta-learner test sample size' for one component of the method, but it does not specify explicit training/validation/test splits (e.g., percentages or counts) for the overall dataset used in the RL experiments.
Hardware Specification | No | The paper does not report the hardware used to run the experiments (e.g., CPU or GPU models, memory).
Software Dependencies | No | The paper mentions PyTorch (Paszke et al., 2019) but does not specify a version for PyTorch or any other software dependency.
Experiment Setup | Yes | "Table 2: Hyperparameters for OUTPACE"; "Table 3: Env-specific hyperparameters for OUTPACE"
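
For readers checking the Open Datasets row, the Meta-World environments cited by the paper are typically constructed as in the following minimal sketch, adapted from the public Meta-World README. The task name 'pick-place-v2' is an illustrative choice, not necessarily one of the six environments used in the paper, and the return signature of env.step varies across metaworld versions.

```python
# Minimal sketch of constructing a Meta-World environment, adapted from the
# public Meta-World README. 'pick-place-v2' is an illustrative task name,
# not necessarily one of the paper's six environments.
import random

import metaworld

ml1 = metaworld.ML1('pick-place-v2')         # benchmark with one task family
env = ml1.train_classes['pick-place-v2']()   # instantiate the environment
task = random.choice(ml1.train_tasks)        # sample a task (goal) variation
env.set_task(task)

obs = env.reset()
action = env.action_space.sample()
# Older metaworld releases return a 4-tuple; newer gymnasium-based releases
# return (obs, reward, terminated, truncated, info) instead.
obs, reward, done, info = env.step(action)
```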