Multi-Task Reinforcement Learning with Soft Modularization

Authors: Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment with various robotics manipulation tasks in simulation and show our method improves both sample efficiency and performance over strong baselines by a large margin.
Researcher Affiliation | Academia | 1 UC San Diego, 2 UC Berkeley, 3 IIIS, Tsinghua, 4 Shanghai Qi Zhi Institute
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our project page with code is at https://rchalyang.github.io/SoftModule/.
Open Datasets | Yes | We evaluate our approach with the recently proposed Meta-World [43] environment.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or test sets. It mentions the MT10 and MT50 challenges but not explicit splits within them.
Hardware Specification | No | The paper mentions the MuJoCo environment but does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not list ancillary software with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | Ours (Shallow) has L = 2 module layers with n = 2 modules per layer, and each module outputs a d = 256 representation; Ours (Deep) has L = 4 module layers with n = 4 modules per layer, and each module outputs a d = 128 representation. The MT-SAC and MT-MH-SAC baselines are trained with 20 million samples on the MT10 setting and 100 million samples on the MT50 setting. Our method and the Mix-Expert and Hard-Routing baselines converge much faster, so they are trained with 15 million samples for MT10 and 50 million samples for MT50.
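To make the reported architecture configuration concrete, the sketch below shows one possible soft-modularized network with L module layers, n modules per layer, and d-dimensional module outputs, plus the Shallow (L=2, n=2, d=256) and Deep (L=4, n=4, d=128) variants named in the setup. This is a minimal illustrative PyTorch sketch, not the authors' released implementation (see https://rchalyang.github.io/SoftModule/ for that); the class name SoftModularNet, the simplified task-conditioned routing, and the input/output dimensions (obs_dim=39, task_dim=10, act_dim=4) are assumptions chosen for the example.

```python
# Minimal sketch of a soft-modularized policy network (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftModularNet(nn.Module):
    def __init__(self, obs_dim, task_dim, act_dim,
                 num_layers=2, num_modules=2, module_dim=256):
        super().__init__()
        self.num_layers = num_layers
        self.num_modules = num_modules
        self.obs_embed = nn.Linear(obs_dim, module_dim)
        self.task_embed = nn.Linear(task_dim, module_dim)
        # L layers, each holding n modules mapping d -> d.
        self.module_layers = nn.ModuleList(
            nn.ModuleList(nn.Linear(module_dim, module_dim) for _ in range(num_modules))
            for _ in range(num_layers)
        )
        # Simplified routing: per layer, soft weights over (n_out x n_in) module pairs,
        # conditioned on the task embedding (a stand-in for the paper's routing network).
        self.routing = nn.ModuleList(
            nn.Linear(module_dim, num_modules * num_modules) for _ in range(num_layers)
        )
        self.head = nn.Linear(module_dim, act_dim)

    def forward(self, obs, task_onehot):
        x = F.relu(self.obs_embed(obs))                        # (B, d)
        z = F.relu(self.task_embed(task_onehot))               # (B, d)
        feats = torch.stack([x] * self.num_modules, dim=1)     # (B, n, d), same input to all modules
        for layer in range(self.num_layers):
            p = self.routing[layer](z).view(-1, self.num_modules, self.num_modules)
            p = F.softmax(p, dim=-1)                           # soft routing weights (B, n_out, n_in)
            outs = torch.stack(
                [F.relu(m(feats[:, j])) for j, m in enumerate(self.module_layers[layer])],
                dim=1,
            )                                                  # (B, n_in, d)
            feats = torch.einsum("boi,bid->bod", p, outs)      # weighted mixture of module outputs
        return self.head(feats.mean(dim=1))                    # aggregate modules, predict action


# Shallow and Deep configurations as described in the experiment setup
# (input/output sizes here are illustrative, not taken from the paper).
shallow = SoftModularNet(obs_dim=39, task_dim=10, act_dim=4,
                         num_layers=2, num_modules=2, module_dim=256)
deep = SoftModularNet(obs_dim=39, task_dim=10, act_dim=4,
                      num_layers=4, num_modules=4, module_dim=128)

obs = torch.randn(8, 39)
task = F.one_hot(torch.randint(0, 10, (8,)), num_classes=10).float()
action = shallow(obs, task)  # -> tensor of shape (8, 4)
```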