Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Independent Skill Transfer for Deep Reinforcement Learning
Authors: Qiangxing Tian, Guanchu Wang, Jinxin Liu, Donglin Wang, Yachen Kang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments including three robotic tasks demonstrate the effectiveness and high efficiency of our proposed IST method in comparison to direct primitive-skill transfer and conventional reinforcement learning. |
| Researcher Affiliation | Academia | ¹Zhejiang University, Hangzhou, China; ²School of Engineering, Westlake University, Hangzhou, China |
| Pseudocode | Yes | Algorithm 1 Learn independent skills (LIS) |
| Open Source Code | Yes | https://github.com/qxtian/Learning-Independent-SKills |
| Open Datasets | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$), which can be used for both IST and primitive skill transfer (PST). According to the source environment Half Cheetah-v3 [...] We regard Half Cheetah-v2 as the source environment, and construct 3 target environments: HCH, HCA and HCU by adding obstruction with adjustable size in the environment |
| Dataset Splits | No | Figure 6 plots the loss of independent skills $\hat{\pi}_\theta(\hat{a} \mid s, \hat{z})$ on training dataset and validation dataset. [...] It is observed that the proposed strategy sampling from the key subset $S$ performs better, where the loss converges to $10^{-8}$ on both training and validation sets, which indicates the validity of observation and action collection in characterizing the primitive skills $\pi(a \mid s, z)$. |
| Hardware Specification | No | The paper discusses simulated environments like 'Half Cheetah-v3' and 'Half Cheetah-v2', but it does not specify any hardware components (e.g., CPU, GPU models, memory) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions algorithms and frameworks like 'SAC' and 'DIAYN' but does not provide specific software versions for any libraries or dependencies. |
| Experiment Setup | Yes | In the pre-training stage, we employ DIAYN to learn 6 primitive skills ($\lvert Z \rvert = 6$) [...] we consider to learn 3 independent skills ($\lvert \hat{Z} \rvert = 3$) for the generation of practical skills. [...] Furthermore, as the trajectory length $T$ increases from 150 to 250, a notable improvement can be obtained. Hence, we keep $T = 250$ in the following experiments. |
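
For readers attempting a reproduction, the setup values quoted in the table can be gathered into one small configuration object. The following is a minimal sketch, not taken from the authors' repository: the class name `ISTSetup` and all field names are hypothetical, and only the numeric values and target environment labels come from the quoted text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ISTSetup:
    """Hypothetical container for the quoted setup; names are illustrative only."""

    # Number of primitive skills learned with DIAYN in pre-training (|Z| = 6).
    num_primitive_skills: int = 6
    # Number of independent skills learned for practical skills (|Z_hat| = 3).
    num_independent_skills: int = 3
    # Trajectory length T kept for the experiments after comparing 150 and 250.
    trajectory_length: int = 250
    # Labels of the three target environments named in the quoted text.
    target_envs: Tuple[str, ...] = ("HCH", "HCA", "HCU")


setup = ISTSetup()
print(setup)
```

Freezing the dataclass keeps the values from being changed by accident between runs. Hyperparameters the table does not quote, such as learning rates or network sizes, are deliberately left out rather than guessed.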