Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning

Authors: Siming Lan, Rui Zhang, Qi Yi, Jiaming Guo, Shaohui Peng, Yunkai Gao, Fan Wu, Ruizhi Chen, Zidong Du, Xing Hu, Xishan Zhang, Ling Li, Yunji Chen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted the experiment on Meta-World, a multi-task RL benchmark containing various robotics manipulation tasks. Experimental results show that CMTA outperforms learning each task individually for the first time and achieves substantial performance improvements over the baselines.
Researcher Affiliation | Collaboration | (1) University of Science and Technology of China; (2) State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; (3) Cambricon Technologies; (4) University of Chinese Academy of Sciences, UCAS, Beijing, China; (5) Intelligent Software Research Center, Institute of Software, CAS, Beijing, China; (6) Shanghai Innovation Center for Processor Technologies, SHIC, Shanghai, China
Pseudocode | Yes | A Pseudo Code. Algorithm 1 CMTA. Initialize: replay buffer D with Initialize: initial hidden state h_0 with zero tensor. Initialize: policy π with ϕ, Q-function Q, task encoder g, k experts f^1, ..., f^k, LSTM, fully connected layer W. Input: state s_t for each environment, one-hot task id z_τ
Open Source Code | Yes | Our code can be found at https://github.com/niiceMing/CMTA.
Open Datasets | Yes | We evaluate the effectiveness of our CMTA model on Meta-World environment [48], which is a collection of robotic manipulation tasks designed to encourage research in multi-task RL.
Dataset Splits | No | The paper describes training and testing procedures, and mentions evaluating the agent during the training phase, but it does not explicitly specify a distinct validation dataset split or the use of a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (like exact GPU/CPU models, processor types, or memory amounts) used for running its experiments, only mentioning the simulated environment.
Software Dependencies | No | We use the following open-source libraries: Meta World (footnote 3), MTEnv (footnote 4), MTRL (footnote 5) [33]. (Footnote 3: https://github.com/rlworkgroup/metaworld, commit-id: af8417bfc82a3e249b4b02156518d775f29eb289.) Only one of the three listed libraries has a specific version identifier (commit-id).
Experiment Setup | Yes | More implementation details of hyper-parameters can be found in Appendix D. ... Appendix D. Hyperparameter Details. Table 3: Hyperparameter values that are common across all the methods ... Table 7: Hyperparameter values of CMTA
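
The initialization step quoted in the Pseudocode entry (a replay buffer, a zero initial hidden state, and a one-hot task id per environment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sizes, and the assumption that the replay buffer starts empty are all hypothetical, chosen only to make the quoted step concrete.

```python
from collections import deque


def one_hot_task_id(task_index, num_tasks):
    """Return a one-hot list for a task index (z_tau in the quoted pseudocode)."""
    if not 0 <= task_index < num_tasks:
        raise ValueError("task_index out of range")
    return [1.0 if i == task_index else 0.0 for i in range(num_tasks)]


def init_training_state(num_envs, hidden_size, buffer_capacity):
    """Create a replay buffer and a zero initial hidden state per environment.

    Assumption: the buffer starts empty; the quoted pseudocode is truncated
    at that point, so this is a guess and not a confirmed detail.
    """
    replay_buffer = deque(maxlen=buffer_capacity)  # D
    h0 = [[0.0] * hidden_size for _ in range(num_envs)]  # h_0, all zeros
    return replay_buffer, h0


# Hypothetical sizes, for illustration only.
replay_buffer, h0 = init_training_state(num_envs=3, hidden_size=4, buffer_capacity=1000)
task_ids = [one_hot_task_id(i, num_tasks=3) for i in range(3)]
```

The policy, Q-function, task encoder, experts, LSTM, and fully connected layer named in the same entry are neural network modules and are left out here; their architectures are specified in the paper and its repository, not in this summary.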