Hierarchical Few-Shot Imitation with Skill Transition Models
Authors: Kourosh Hakhamaneshi, Ruihan Zhao, Albert Zhan, Pieter Abbeel, Michael Laskin
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments we are interested in answering the following questions: (i) Can our method successfully imitate unseen long-horizon downstream demonstrations? (ii) What is the importance of the semi-parametric approach vs. future conditioning? (iii) Is pre-training and fine-tuning the skill embedding model necessary for achieving a high success rate? (see also Sections 4.2 RESULTS and 4.3 ABLATION STUDIES) |
| Researcher Affiliation | -1 | Anonymous authors. Paper under double-blind review. |
| Pseudocode | Yes | Algorithm 1 FIST: Evaluation Algorithm (a hedged sketch of this evaluation loop appears after the table) |
| Open Source Code | Yes | Our codebase builds upon the SPiRL released code and is located at https://anonymous.4open.science/r/fist-C5DF/README.md. |
| Open Datasets | Yes | The data for Point Maze is collected using the same scripts provided in the D4RL dataset repository Fu et al. (2020). and The data for Ant Maze is solely based on the "ant-large-diverse-v0" dataset in D4RL. |
| Dataset Splits | No | The paper describes training and fine-tuning procedures but does not explicitly provide the training/validation/test dataset splits needed for reproduction. It mentions fine-tuning on D_demo, which consists of 10 expert trajectories, but this is not explicitly called a validation set or part of a formal split. |
| Hardware Specification | Yes | The training for both skill extraction and fine-tuning were done on a single NVIDIA 2080Ti GPU. |
| Software Dependencies | No | The paper mentions using the Adam optimizer with specific parameters, but does not provide version numbers for the programming language or major libraries used in the implementation. |
| Experiment Setup | Yes | Hyperparameters used for training are listed in Table 3. Contrastive Distance Metric: encoder output dim 32, encoder hidden layers 128, encoder # hidden layers 2, optimizer Adam(β1 = 0.9, β2 = 0.999, LR = 1e-3). Skill extraction: epochs 200, batch size 128, optimizer Adam(β1 = 0.9, β2 = 0.999, LR = 1e-3), H (sub-trajectory length) 10, β = 5e-4 (Kitchen) / 1e-2 (Maze). Skill Encoder: dim-Z in VAE 128, hidden dim 128, # LSTM layers 1. Skill Decoder: hidden dim 128, # hidden layers 5. Inverse Skill Dynamic Model: hidden dim 128, # hidden layers 5. (A hedged architecture sketch based on these values appears after this table.) |
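
The hyperparameters quoted in the Experiment Setup row describe the skill-embedding architecture (contrastive distance-metric encoder, LSTM skill encoder, skill decoder, and inverse skill dynamics model). Below is a minimal PyTorch sketch of how those dimensions could be wired together; the module and variable names (`SkillEncoder`, `skill_decoder`, `inverse_skill_model`, `distance_encoder`, `STATE_DIM`, `ACTION_DIM`) and the state/action sizes are placeholders, not names from the released FIST/SPiRL codebase.

```python
# Hedged sketch: PyTorch modules mirroring the Table 3 hyperparameters quoted above.
# STATE_DIM and ACTION_DIM are placeholders; class and variable names are illustrative.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 60, 9   # placeholder sizes (kitchen-like environment)
H = 10                          # sub-trajectory (skill) length
Z_DIM = 128                     # dim-Z of the skill VAE latent


def mlp(in_dim, out_dim, hidden=128, n_hidden=2):
    """Simple MLP builder used by the modules below."""
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)


class SkillEncoder(nn.Module):
    """LSTM posterior q(z | s_{t:t+H}, a_{t:t+H}) -> Gaussian over skills (1 LSTM layer, hidden dim 128)."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(STATE_DIM + ACTION_DIM, 128, num_layers=1, batch_first=True)
        self.mu = nn.Linear(128, Z_DIM)
        self.log_std = nn.Linear(128, Z_DIM)

    def forward(self, states, actions):                 # (B, H, ·) tensors
        _, (h, _) = self.lstm(torch.cat([states, actions], dim=-1))
        return self.mu(h[-1]), self.log_std(h[-1])


# Skill decoder pi(a_t | s_t, z): hidden dim 128, 5 hidden layers.
skill_decoder = mlp(STATE_DIM + Z_DIM, ACTION_DIM, hidden=128, n_hidden=5)

# Inverse skill dynamics q(z | s_t, s_{t+H}): hidden dim 128, 5 hidden layers,
# outputs the mean and log-std of a Gaussian over skills.
inverse_skill_model = mlp(2 * STATE_DIM, 2 * Z_DIM, hidden=128, n_hidden=5)

# Contrastive distance-metric encoder: output dim 32, hidden dim 128, 2 hidden layers.
distance_encoder = mlp(STATE_DIM, 32, hidden=128, n_hidden=2)

# Adam with the betas and learning rate listed in Table 3.
optimizer = torch.optim.Adam(
    list(skill_decoder.parameters()) + list(inverse_skill_model.parameters()),
    lr=1e-3, betas=(0.9, 0.999))
```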
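The Pseudocode row quotes Algorithm 1 (FIST: Evaluation Algorithm). The sketch below paraphrases the semi-parametric evaluation loop described in the paper: look up the nearest demonstration state under the learned distance metric, take the state H steps ahead as the future conditioning, infer a skill with the inverse skill dynamics model, and execute it with the low-level decoder. It reuses the placeholder modules from the previous sketch and assumes a gym-style `env` and a tensor of demonstration states, so it is an illustration of the procedure rather than the authors' implementation.

```python
# Hedged sketch of the semi-parametric evaluation loop summarised by Algorithm 1.
# `env` is assumed to follow the classic gym step() API; `demo_states` is an
# (N, STATE_DIM) tensor of demonstration states. Names are illustrative.
import torch


@torch.no_grad()
def fist_episode(env, demo_states, distance_encoder, inverse_skill_model,
                 skill_decoder, H=10, max_steps=300):
    """Roll out one episode: look up the future state in the demos, infer a
    skill, then execute H low-level actions before re-planning."""
    demo_emb = distance_encoder(demo_states)             # (N, 32) demo embeddings
    obs, steps = env.reset(), 0
    while steps < max_steps:
        s = torch.as_tensor(obs, dtype=torch.float32)
        # 1) Semi-parametric lookup: nearest demo state under the learned metric.
        idx = torch.cdist(distance_encoder(s[None]), demo_emb).argmin().item()
        # 2) Future conditioning: take the state H steps ahead in the demonstration.
        s_future = demo_states[min(idx + H, len(demo_states) - 1)]
        # 3) Infer the skill from (s_t, s_{t+H}); use the predicted mean.
        out = inverse_skill_model(torch.cat([s, s_future])[None])
        z = out[:, :out.shape[1] // 2]
        # 4) Execute the skill for H steps with the low-level decoder.
        for _ in range(H):
            s = torch.as_tensor(obs, dtype=torch.float32)
            a = skill_decoder(torch.cat([s, z.squeeze(0)])[None]).squeeze(0)
            obs, _, done, _ = env.step(a.numpy())        # classic 4-tuple gym API
            steps += 1
            if done or steps >= max_steps:
                return obs
    return obs
```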