Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks

Authors: Donghoon Kim, Minjong Yoo, Honguk Woo

IJCAI 2024 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments with the maze and Franka kitchen environments, we demonstrate the superiority and efficiency of our GLVSA framework in adapting GC policies to a wide range of long-horizon goals. The framework achieves competitive zero-shot and few-shot adaptation performance, outperforming existing GC policy learning and skill-based methods. |
| Researcher Affiliation | Academia | Donghoon Kim¹, Minjong Yoo², and Honguk Woo¹˒² — ¹Department of Artificial Intelligence, Sungkyunkwan University; ²Department of Computer Science and Engineering, Sungkyunkwan University |
| Pseudocode | Yes | See Algorithm 1 in the Appendix, which lists the iterative learning procedure for the offline training phase. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code, nor a link to a code repository for the described methodology. |
| Open Datasets | Yes | The experiments involve two environments from D4RL [Fu et al., 2020]: maze navigation (maze) and Franka kitchen simulation (kitchen). Both environments use offline datasets for training: 3,046 trajectories in the maze from [Pertsch et al., 2022] and 603 in the kitchen from [Fu et al., 2020]. |
| Dataset Splits | No | The paper does not explicitly specify train/validation/test dataset splits (e.g., percentages or sample counts) used for the experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | No | The main text does not provide concrete experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |