Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks

Authors: Donghoon Kim, Minjong Yoo, Honguk Woo

IJCAI 2024 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments with the maze and Franka kitchen environments, we demonstrate the superiority and efficiency of our GLVSA framework in adapting GC policies to a wide range of long-horizon goals. The framework achieves competitive zero-shot and few-shot adaptation performance, outperforming existing GC policy learning and skill-based methods. |
| Researcher Affiliation | Academia | Donghoon Kim¹, Minjong Yoo², and Honguk Woo¹˒² — ¹Department of Artificial Intelligence, Sungkyunkwan University; ²Department of Computer Science and Engineering, Sungkyunkwan University |
| Pseudocode | Yes | See Algorithm 1 in the Appendix, which lists the iterative learning procedure for the offline training phase. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code, nor a link to a code repository for the described methodology. |
| Open Datasets | Yes | The experiments involve two environments from D4RL [Fu et al., 2020]: maze navigation (maze) and Franka kitchen simulation (kitchen). Both environments use offline datasets for training: 3,046 trajectories in the maze from [Pertsch et al., 2022] and 603 in the kitchen from [Fu et al., 2020]. |
| Dataset Splits | No | The paper does not explicitly specify train/validation/test dataset splits (e.g., percentages or sample counts) used for the experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | No | The main text does not provide concrete experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |