Meta Reinforcement Learning with Task Embedding and Shared Policy

Authors: Lin Lan, Zhenguo Li, Xiaohong Guan, Pinghui Wang

IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns compared to baselines." |
| Researcher Affiliation | Collaboration | Lin Lan (1), Zhenguo Li (2), Xiaohong Guan (1,3,4), and Pinghui Wang (3,1). (1) MOE NSKEY Lab, Xi'an Jiaotong University, China; (2) Huawei Noah's Ark Lab; (3) Shenzhen Research School, Xi'an Jiaotong University, China; (4) Department of Automation and NLIST Lab, Tsinghua University, China. llan@sei.xjtu.edu.cn, li.zhenguo@huawei.com, {xhguan, phwang}@mail.xjtu.edu.cn |
| Pseudocode | Yes | "Algorithm 1: Training Procedure of TESP" |
| Open Source Code | Yes | "Code available at https://github.com/llan-ml/tesp." |
| Open Datasets | No | Tasks are sampled within the MuJoCo simulator ("we sample 100 target locations... as training tasks D"); the paper does not provide access information for any pre-existing public dataset. |
| Dataset Splits | No | The paper trains and tests on different sets of sampled tasks (D, D', D''), but it does not specify explicit splits (e.g., percentages or counts) for training, validation, and testing as one would for a fixed dataset. |
| Hardware Specification | No | The paper does not detail the hardware used to run the experiments (e.g., CPU or GPU models, memory), mentioning only the use of the MuJoCo simulator. |
| Software Dependencies | No | The paper names software components such as the MuJoCo simulator, VPG, and PPO, but gives no version numbers for these or any other dependencies. |
| Experiment Setup | No | The paper reports general settings (e.g., K = 3 fast-update steps, VPG for the fast-update and PPO for the meta-update), but explicitly defers "detailed settings of environments and experiments" to a supplementary resource at the provided GitHub link rather than including them in the main text. |
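The training structure the assessment refers to (a per-task embedding adapted with K = 3 fast-update steps while shared parameters are meta-updated across tasks) can be illustrated with a toy sketch. This is not the paper's method: the quadratic loss standing in for a negative RL return, the learning rates, the first-order meta-gradient, and all function names are assumptions for illustration; the paper uses VPG for the fast-update and PPO for the meta-update inside MuJoCo.

```python
import numpy as np

# Toy sketch of "task embedding + shared parameters" meta-learning.
# The quadratic loss 0.5 * ||w + e - target||^2 is a hypothetical stand-in
# for a negative return; analytic gradients replace VPG/PPO.

rng = np.random.default_rng(0)

DIM = 4          # embedding / parameter dimension (arbitrary)
K = 3            # fast-update steps, matching K = 3 reported in the paper
INNER_LR = 0.1   # fast-update learning rate (assumed)
META_LR = 0.05   # meta-update learning rate (assumed)

def residual(w, e, target):
    # Gradient of the quadratic loss w.r.t. either w or e
    # (both equal w + e - target).
    return w + e - target

def adapt(w, target):
    # Fast-update: adapt only the task embedding; shared w stays fixed.
    e = np.zeros(DIM)
    for _ in range(K):
        e = e - INNER_LR * residual(w, e, target)
    return e

def adapted_loss(w, targets):
    # Mean post-adaptation loss across tasks.
    return float(np.mean([0.5 * np.sum(residual(w, adapt(w, t), t) ** 2)
                          for t in targets]))

targets = [rng.normal(size=DIM) for _ in range(8)]  # stand-ins for sampled tasks
w = np.zeros(DIM)                                   # shared meta-learned parameters

loss_before = adapted_loss(w, targets)
for _ in range(200):                                # meta-training loop
    meta_grad = np.zeros(DIM)
    for t in targets:
        # First-order meta-gradient: ignore how the adapted embedding
        # depends on w, as in first-order MAML-style approximations.
        meta_grad += residual(w, adapt(w, t), t)
    w = w - META_LR * meta_grad / len(targets)
loss_after = adapted_loss(w, targets)
```

After meta-training, the shared parameters land where K embedding steps suffice to fit each sampled task well, so `loss_after` drops below `loss_before`; the same division of labor (fast per-task adaptation, slow shared update) is what Algorithm 1 in the paper organizes around VPG and PPO.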