PaCo: Parameter-Compositional Multi-task Reinforcement Learning
Authors: Lingfeng Sun, Haichao Zhang, Wei Xu, Masayoshi Tomizuka
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now empirically test the performance of our Parameter-Compositional Multi-Task RL framework on the Meta-World benchmark [36]. |
| Researcher Affiliation | Collaboration | 1University of California Berkeley 2Horizon Robotics |
| Pseudocode | Yes | Algorithm 1 Parameter-Compositional MTRL (PaCo) (a minimal sketch follows the table) |
| Open Source Code | No | An anonymized placeholder URL is included in the paper; the authors state they will replace it with a de-anonymized URL and release the code at a later time. |
| Open Datasets | Yes | We now empirically test the performance of our Parameter-Compositional Multi-Task RL framework on the Meta-World benchmark [36]. |
| Dataset Splits | Yes | For training on MT10-rand, we follow the settings introduced in [24] and use i) 10 parallel environments, ii) 20 million environment steps for the 10 tasks together (2 million per task), iii) repeated training with 10 different random seeds for each method. |
| Hardware Specification | Yes | We provided details on computational resources in the supplementary file (Appendix Section 2.3), due to limited space in the main paper. |
| Software Dependencies | No | The paper mentions using 'Soft Actor-Critic (SAC) [9]' but does not provide specific version numbers for SAC or any other software libraries (e.g., PyTorch, TensorFlow, etc.) used for implementation. |
| Experiment Setup | Yes | For training on MT10-rand, we follow the settings introduced in [24] and use i) 10 parallel environments, ii) 20 million environment steps for the 10 tasks together (2 million per task), iii) repeated training with 10 different random seeds for each method. (These settings are restated as a config sketch below the table.) |
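
The Pseudocode row refers to the paper's Algorithm 1. As a rough illustration of the parameter-compositional idea behind PaCo (task-specific parameters obtained by combining a shared parameter set with a per-task composition vector), here is a minimal PyTorch-style sketch; the class name, argument names, and dimensions are illustrative assumptions and are not taken from the paper's code.

```python
import torch
import torch.nn as nn


class ParamCompositionalLayer(nn.Module):
    """Minimal sketch of a parameter-compositional layer (illustrative only).

    A shared parameter set ``phi`` holds K candidate weight matrices; each
    task ``tau`` combines them with its own learnable composition vector
    ``w[tau]``, giving task-specific weights theta_tau = sum_k w[tau, k] * phi[k].
    """

    def __init__(self, in_dim: int, out_dim: int, num_tasks: int, num_param_sets: int):
        super().__init__()
        # Shared parameter set: K weight matrices of shape (out_dim, in_dim).
        self.phi = nn.Parameter(0.01 * torch.randn(num_param_sets, out_dim, in_dim))
        # One learnable composition vector per task.
        self.w = nn.Parameter(0.01 * torch.randn(num_tasks, num_param_sets))
        self.bias = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Compose the task-specific weight matrix from the shared parameter set.
        theta_tau = torch.einsum("k,koi->oi", self.w[task_id], self.phi)
        return x @ theta_tau.t() + self.bias


# Usage: one layer shared across 10 tasks (dimensions are arbitrary examples).
layer = ParamCompositionalLayer(in_dim=39, out_dim=256, num_tasks=10, num_param_sets=5)
obs = torch.randn(32, 39)       # a batch of observations for a single task
out = layer(obs, task_id=3)     # task-specific forward pass, shape (32, 256)
```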
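
The settings quoted in the Dataset Splits and Experiment Setup rows can also be restated as a small configuration sketch. Every field name below is hypothetical; the values simply re-encode the numbers from the quoted text (10 parallel environments, 20 million total environment steps, 2 million per task, 10 random seeds, SAC as the base algorithm).

```python
from dataclasses import dataclass


@dataclass
class MT10RandTrainingConfig:
    """Hypothetical config restating the quoted MT10-rand training settings."""
    benchmark: str = "Meta-World MT10-rand"
    base_algorithm: str = "SAC"          # Soft Actor-Critic, as cited in the paper
    num_parallel_envs: int = 10          # one parallel environment per task
    total_env_steps: int = 20_000_000    # 20 million steps for the 10 tasks together
    env_steps_per_task: int = 2_000_000  # 2 million steps per task
    num_random_seeds: int = 10           # repeated training runs per method


config = MT10RandTrainingConfig()
assert config.env_steps_per_task * config.num_parallel_envs == config.total_env_steps
```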