Sharing Experience in Multitask Reinforcement Learning

Authors: Tung-Long Vuong, Do-Van Nguyen, Tai-Long Nguyen, Cong-Minh Bui, Hai-Dang Kieu, Viet-Cuong Ta, Quoc-Long Tran, Thanh-Ha Le

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments highlight that our framework improves the performance and the stability of learning task-policies, and can help task-policies avoid local optima.
Researcher Affiliation | Academia | HMI Lab, UET, Vietnam National University, Hanoi, Vietnam
Pseudocode | Yes | Algorithm 1: Sharing-experience framework with Z agent
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | No | The paper uses custom-designed multi-task gridworld environments (Section 4), described as "Multiple Goals Well-Gated Grid-World" and "Multiple Goals Grid-World without Wall". No links, DOIs, repositories, or formal citations to publicly available datasets are provided.
Dataset Splits | No | The paper describes experiments in custom gridworld environments where agents interact directly with the environment. Table 1 lists "Number of rollouts each Iter." and "Rollout length", which are data-generation parameters for reinforcement learning, but no traditional training/validation/test splits with percentages or sample counts are specified for reproduction.
Hardware Specification | No | The paper does not provide any details about the hardware used to run the experiments (e.g., GPU models, CPU types, memory, or cloud instance specifications).
Software Dependencies | No | The paper mentions algorithms and architectures such as Q-learning, SARSA, CNN, LSTM, DNN, DRL, the PPO algorithm, and a basic advantage actor-critic, but it does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed for replication.
Experiment Setup | Yes | The paper details its experimental setup in Table 1 ("Hyper-parameters Setting"), which lists parameters such as a discount factor of 0.99, 1000 iterations, an actor-critic learning rate of 0.005, and the RMSProp optimizer, and in Table 2 ("Network structures"), which specifies the architectures of the Policy, Value, and Z networks, including layer sizes.
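For concreteness, the sketch below shows how the hyper-parameters quoted from Table 1 might be assembled into a training configuration. The choice of PyTorch, the observation/action dimensions, and the hidden-layer widths are assumptions made purely for illustration; the paper's Table 2 layer sizes are not reproduced here and no official code is available.

```python
# Hypothetical configuration sketch based on the hyper-parameters quoted from Table 1.
# PyTorch and all network sizes are assumptions for illustration only; the paper
# does not state its framework, and the actual Table 2 architectures are not copied here.
import torch
import torch.nn as nn

config = {
    "discount_factor": 0.99,   # "Discounted factor 0.99" (Table 1)
    "num_iterations": 1000,    # "Number of Iterations 1000" (Table 1)
    "lr_actor_critic": 0.005,  # "Learning rate actor-critic 0.005" (Table 1)
    "optimizer": "RMSProp",    # "Optimizer RMSProp" (Table 1)
}

# Placeholder actor-critic networks; input/output and hidden sizes are NOT from the paper.
policy_net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4), nn.Softmax(dim=-1))
value_net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

optimizer = torch.optim.RMSprop(
    list(policy_net.parameters()) + list(value_net.parameters()),
    lr=config["lr_actor_critic"],
)
```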