Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning
Authors: Siming Lan, Rui Zhang, Qi Yi, Jiaming Guo, Shaohui Peng, Yunkai Gao, Fan Wu, Ruizhi Chen, Zidong Du, Xing Hu, Xishan Zhang, Ling Li, Yunji Chen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted the experiment on Meta-World, a multi-task RL benchmark containing various robotics manipulation tasks. Experimental results show that CMTA outperforms learning each task individually for the first time and achieves substantial performance improvements over the baselines. |
| Researcher Affiliation | Collaboration | 1 University of Science and Technology of China; 2 State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 3 Cambricon Technologies; 4 University of Chinese Academy of Sciences, UCAS, Beijing, China; 5 Intelligent Software Research Center, Institute of Software, CAS, Beijing, China; 6 Shanghai Innovation Center for Processor Technologies, SHIC, Shanghai, China |
| Pseudocode | Yes | A Pseudo Code. Algorithm 1: CMTA. Initialize: replay buffer D with Initialize: initial hidden state h_0 with zero tensor. Initialize: policy π with ϕ, Q-function Q, task encoder g, k experts f_1, …, f_k, LSTM, fully connected layer W. Input: state s_t for each environment, one-hot task id z_τ |
| Open Source Code | Yes | Our code can be found at https://github.com/niiceMing/CMTA. |
| Open Datasets | Yes | We evaluate the effectiveness of our CMTA model on Meta-World environment [48], which is a collection of robotic manipulation tasks designed to encourage research in multi-task RL. |
| Dataset Splits | No | The paper describes training and testing procedures, and mentions evaluating the agent during the training phase, but it does not explicitly specify a distinct validation dataset split or the use of a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (like exact GPU/CPU models, processor types, or memory amounts) used for running its experiments, only mentioning the simulated environment. |
| Software Dependencies | No | We use the following open-source libraries: Meta World³, MTEnv⁴, MTRL⁵ [33]. (Footnote 3: https://github.com/rlworkgroup/metaworld, commit-id: af8417bfc82a3e249b4b02156518d775f29eb289). Only one of the three listed libraries has a specific version identifier (a commit-id). |
| Experiment Setup | Yes | More implementation details of hyper-parameters can be found in Appendix D. ... Appendix D. Hyperparameter Details. Table 3: Hyperparameter values that are common across all the methods ... Table 7: Hyperparameter values of CMTA |
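
Since the Software Dependencies row gives an exact commit-id only for Meta-World, one way to carry that pin into a reproduction environment is a commit-pinned pip requirement. The sketch below is illustrative only: the helper names `is_full_sha` and `pinned_requirement` are hypothetical, and the distribution name `metaworld` is an assumption not stated in the table. The repository URL and commit-id are copied from the row above.

```python
import re

# Values copied from the Software Dependencies row above.
METAWORLD_REPO = "https://github.com/rlworkgroup/metaworld"
METAWORLD_COMMIT = "af8417bfc82a3e249b4b02156518d775f29eb289"


def is_full_sha(commit: str) -> bool:
    """Hypothetical helper: True if `commit` looks like a full 40-character Git SHA-1."""
    return re.fullmatch(r"[0-9a-f]{40}", commit) is not None


def pinned_requirement(name: str, repo_url: str, commit: str) -> str:
    """Hypothetical helper: build a direct-reference requirement pinned to one commit.

    The distribution name passed as `name` is an assumption, not taken from the paper.
    """
    if not is_full_sha(commit):
        raise ValueError(f"not a full commit hash: {commit!r}")
    return f"{name} @ git+{repo_url}@{commit}"


if __name__ == "__main__":
    # Print a line that could be placed in a requirements.txt file.
    print(pinned_requirement("metaworld", METAWORLD_REPO, METAWORLD_COMMIT))
```

MTEnv and MTRL have no version identifier in the table, so no equivalent pin can be written for them without additional information.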