Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks
Authors: Donghoon Kim, Minjong Yoo, Honguk Woo
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments with the maze and Franka kitchen environments, we demonstrate the superiority and efficiency of our GLvSA framework in adapting GC policies to a wide range of long-horizon goals. The framework achieves competitive zero-shot and few-shot adaptation performance, outperforming existing GC policy learning and skill-based methods. |
| Researcher Affiliation | Academia | Donghoon Kim¹, Minjong Yoo², and Honguk Woo¹·²; ¹Department of Artificial Intelligence, Sungkyunkwan University; ²Department of Computer Science and Engineering, Sungkyunkwan University; {qwef523, mjyoo2, hwoo}@skku.edu |
| Pseudocode | Yes | See Algorithm 1 in the Appendix, which lists the iterative learning procedure for the offline training phase. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The experiment involves two environments from D4RL [Fu et al., 2020]: maze navigation (maze) and Franka kitchen simulation (kitchen). Both environments use offline datasets for training; i.e., 3,046 trajectories in the maze and 603 in the kitchen from [Pertsch et al., 2022] and [Fu et al., 2020], respectively. |
| Dataset Splits | No | The paper does not explicitly provide specific details on train/validation/test dataset splits (e.g., percentages or sample counts) used for their experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with specific versions like PyTorch 1.9, Python 3.8). |
| Experiment Setup | No | The paper does not explicitly provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations in the main text. |