Independent Skill Transfer for Deep Reinforcement Learning
Authors: Qiangxing Tian, Guanchu Wang, Jinxin Liu, Donglin Wang, Yachen Kang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments including three robotic tasks demonstrate the effectiveness and high efficiency of our proposed IST method in comparison to direct primitive-skill transfer and conventional reinforcement learning. |
| Researcher Affiliation | Academia | ¹Zhejiang University, Hangzhou, China; ²School of Engineering, Westlake University, Hangzhou, China. {tianqiangxing, liujinxin, wangdonglin, kangyachen}@westlake.edu.cn, hegsns@gmail.com |
| Pseudocode | Yes | Algorithm 1 Learn independent skills (LIS) |
| Open Source Code | Yes | https://github.com/qxtian/Learning-Independent-SKills |
| Open Datasets | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$), which can be used for both IST and primitive skill transfer (PST). According to the source environment HalfCheetah-v3 [...] We regard HalfCheetah-v2 as the source environment, and construct 3 target environments: HCH, HCA and HCU by adding obstruction with adjustable size in the environment |
| Dataset Splits | No | Figure 6 plots the loss of independent skills $\hat{\pi}_\theta(\hat{a} \mid s, \hat{z})$ on training dataset and validation dataset. [...] It is observed that the proposed strategy sampling from the key subset S performs better, where the loss converges to $10^{-8}$ on both training and validation sets, which indicates the validity of observation and action collection in characterizing the primitive skills $\pi(a \mid s, z)$. |
| Hardware Specification | No | The paper discusses simulated environments like 'HalfCheetah-v3' and 'HalfCheetah-v2', but it does not specify any hardware components (e.g., CPU, GPU models, memory) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions algorithms and frameworks like 'SAC' and 'DIAYN' but does not provide specific software versions for any libraries or dependencies. |
| Experiment Setup | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$) [...] we consider to learn 3 independent skills ($\lvert \hat{Z} \rvert = 3$) for the generation of practical skills. [...] Furthermore, as the trajectory length T increases from 150 to 250, a notable improvement can be obtained. Hence, we keep T = 250 in the following experiments. |
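
The experiment setup quoted in the table can be collected into a small configuration record, which may help when attempting a reproduction. The sketch below is illustrative only: the class name `ISTSetup` and its field names are hypothetical and do not come from the paper or its repository. Only the values (6 primitive skills, 3 independent skills, T = 250, and the target environments HCH, HCA and HCU) are taken from the quoted text. The source environment is left as a required argument because the excerpts mention both HalfCheetah-v2 and HalfCheetah-v3.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ISTSetup:
    """Hypothetical container for the setup values quoted in the table."""

    # The excerpts name both HalfCheetah-v2 and HalfCheetah-v3, so no default is assumed.
    source_env: str
    # Target environments built by adding an obstruction to the source environment.
    target_envs: Tuple[str, ...] = ("HCH", "HCA", "HCU")
    # |Z| = 6 primitive skills learned with DIAYN in pre-training.
    num_primitive_skills: int = 6
    # |Z_hat| = 3 independent skills used to generate practical skills.
    num_independent_skills: int = 3
    # Trajectory length T kept at 250 in the reported experiments.
    trajectory_length: int = 250

    def __post_init__(self) -> None:
        # Basic sanity checks; these are assumptions, not constraints stated in the paper.
        if self.num_primitive_skills <= 0 or self.num_independent_skills <= 0:
            raise ValueError("skill counts must be positive")
        if self.trajectory_length <= 0:
            raise ValueError("trajectory_length must be positive")


# Example: choosing HalfCheetah-v2 as the source environment (one of the two names quoted).
setup = ISTSetup(source_env="HalfCheetah-v2")
```

Making the record frozen keeps the quoted values from being changed accidentally during a run; a variant such as T = 150 would be created as a separate `ISTSetup` instance.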