Independent Skill Transfer for Deep Reinforcement Learning

Authors: Qiangxing Tian, Guanchu Wang, Jinxin Liu, Donglin Wang, Yachen Kang

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments including three robotic tasks demonstrate the effectiveness and high efficiency of our proposed IST method in comparison to direct primitive-skill transfer and conventional reinforcement learning. |
| Researcher Affiliation | Academia | ¹Zhejiang University, Hangzhou, China; ²School of Engineering, Westlake University, Hangzhou, China. {tianqiangxing, liujinxin, wangdonglin, kangyachen}@westlake.edu.cn, hegsns@gmail.com |
| Pseudocode | Yes | Algorithm 1: Learn independent skills (LIS) |
| Open Source Code | Yes | https://github.com/qxtian/Learning-Independent-SKills |
| Open Datasets | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$), which can be used for both IST and primitive skill transfer (PST). According to the source environment Half Cheetah-v3 [...] We regard Half Cheetah-v2 as the source environment, and construct 3 target environments: HCH, HCA and HCU by adding obstruction with adjustable size in the environment |
| Dataset Splits | No | Figure 6 plots the loss of independent skills $\hat{\pi}_\theta(\hat{a} \mid s, \hat{z})$ on training dataset and validation dataset. [...] It is observed that the proposed strategy sampling from the key subset S performs better, where the loss converges to $10^{-8}$ on both training and validation sets, which indicates the validity of observation and action collection in characterizing the primitive skills $\pi(a \mid s, z)$. |
| Hardware Specification | No | The paper discusses simulated environments like 'Half Cheetah-v3' and 'Half Cheetah-v2', but it does not specify any hardware components (e.g., CPU, GPU models, memory) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions algorithms and frameworks like 'SAC' and 'DIAYN' but does not provide specific software versions for any libraries or dependencies. |
| Experiment Setup | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$) [...] we consider to learn 3 independent skills ($\lvert \hat{Z} \rvert = 3$) for the generation of practical skills. [...] Furthermore, as the trajectory length T increases from 150 to 250, a notable improvement can be obtained. Hence, we keep T = 250 in the following experiments. |
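
The Experiment Setup entry above quotes three concrete values: 6 primitive skills, 3 independent skills, and a trajectory length of T = 250. As a rough illustration only, the sketch below collects them in a small Python config and shows one common way a DIAYN-style policy can be conditioned on a skill, namely by drawing the skill index from a uniform prior and appending its one-hot code to the state. The class, field, and function names (`ISTSetup`, `sample_skill`, `skill_conditioned_input`) are hypothetical and are not taken from the paper or its repository, and the one-hot concatenation is an assumption about the input encoding rather than a documented detail of the authors' code.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ISTSetup:
    """Values quoted in the Experiment Setup entry; field names are hypothetical."""
    num_primitive_skills: int = 6      # |Z| = 6, learned with DIAYN
    num_independent_skills: int = 3    # |Z_hat| = 3
    trajectory_length: int = 250       # T = 250
    target_envs: Tuple[str, ...] = ("HCH", "HCA", "HCU")


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot list of length `size` with a 1.0 at position `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    return [1.0 if i == index else 0.0 for i in range(size)]


def sample_skill(num_skills: int, rng: random.Random) -> int:
    """Draw a skill index from a uniform prior over the skills."""
    return rng.randrange(num_skills)


def skill_conditioned_input(
    state: Sequence[float], skill: int, num_skills: int
) -> List[float]:
    """Append the one-hot skill code to the state (assumed input encoding)."""
    return list(state) + one_hot(skill, num_skills)


if __name__ == "__main__":
    setup = ISTSetup()
    rng = random.Random(0)
    z = sample_skill(setup.num_primitive_skills, rng)
    placeholder_state = [0.0, 0.1, -0.2, 0.3]  # placeholder values, not a real observation
    x = skill_conditioned_input(placeholder_state, z, setup.num_primitive_skills)
    print(f"skill index: {z}, policy input length: {len(x)}")
```

With this encoding the policy input grows by one entry per skill, so a 4-dimensional placeholder state and 6 primitive skills give an input of length 10.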