Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts
Authors: Ahmed Hendawy, Jan Peters, Carlo D'Eramo
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we introduce a novel approach for representation learning in MTRL that encapsulates common structures among the tasks using orthogonal representations to promote diversity. Our method, named Mixture Of Orthogonal Experts (MOORE), leverages a Gram-Schmidt process to shape a shared subspace of representations generated by a mixture of experts. When task-specific information is provided, MOORE generates relevant representations from this shared subspace. We assess the effectiveness of our approach on two MTRL benchmarks, namely MiniGrid and MetaWorld, showing that MOORE surpasses related baselines and establishes a new state-of-the-art result on MetaWorld. |
| Researcher Affiliation | Academia | Ahmed Hendawy¹,², Jan Peters¹,²,³,⁴, Carlo D'Eramo¹,²,⁵. ¹Department of Computer Science, TU Darmstadt, Germany; ²Hessian Center for Artificial Intelligence (Hessian.ai), Germany; ³Center for Cognitive Science, TU Darmstadt, Germany; ⁴German Research Center for AI (DFKI), Systems AI for Robot Learning; ⁵Center for Artificial Intelligence and Data Science, University of Würzburg, Germany |
| Pseudocode | Yes | Algorithm 1 (MOORE for Actor). Require: mixture of experts h_ϕ, state s, context c, task-specific weights w_c, output function f_θ. 1: U_s = h_ϕ(s); 2: V_s = GS(U_s) (apply Eq. 2); 3: v_c = V_s w_c; 4: a ← f_θ(v_c); 5: return a. A runnable sketch of this procedure is given after the table. |
| Open Source Code | Yes | The code is available at https://github.com/AhmedMagdyHendawy/MOORE. |
| Open Datasets | Yes | We consider different tasks in MiniGrid (Chevalier-Boisvert et al., 2023), a suite of 2D goal-oriented environments that require solving different mazes while interacting with objects like doors, keys, or boxes of several colors, shapes, and roles. MiniGrid offers a visual representation of the state, which we adopt for our multi-task setting. ... We benchmark against MetaWorld (Yu et al., 2019), a widely adopted robotic manipulation benchmark for Multi-Task and Meta Reinforcement Learning. |
| Dataset Splits | No | No explicit training/validation/test dataset splits are described; the paper only makes general statements about using standard benchmarks. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU models, memory) are provided. The paper only states: 'Calculations for this research were conducted on the Lichtenberg high-performance computer of the TU Darmstadt and the Intelligent Autonomous Systems (IAS) cluster at TU Darmstadt.' |
| Software Dependencies | No | The paper mentions using 'Mushroom-RL (D'Eramo et al., 2021) as the RL library' but does not specify a version number for Mushroom-RL or any other critical software dependencies. |
| Experiment Setup | Yes | In Tab. 4, we highlight the important hyperparameters needed to reproduce the results on Mini Grid. ... In Tab. 5, we illustrate the hyperparameters of both the representation block and the output head. ... In Tab. 6, we list the hyperparameters required for reproducing our results on Meta World. ... In Tab. 7, we list the hyperparameters for Actor and Critic Architecture for SAC. |
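
For concreteness, below is a minimal PyTorch sketch of Algorithm 1 (the MOORE actor forward pass): each expert produces a representation, a Gram-Schmidt step orthonormalizes them into a shared basis (the paper's Eq. 2), and task-specific weights combine the basis vectors before the output head. The expert architecture, the dimensions, and names such as `MooreActor` and `gram_schmidt` are illustrative assumptions, not the authors' implementation; the repository linked above is authoritative.

```python
# Minimal sketch of MOORE's actor (Algorithm 1), assuming a PyTorch setup.
# Architecture details and dimensions are illustrative stand-ins.
import torch
import torch.nn as nn


def gram_schmidt(U: torch.Tensor) -> torch.Tensor:
    """Orthonormalize k expert representations (rows of U, shape [k, d])."""
    basis = []
    for u in U:
        v = u.clone()
        for b in basis:
            v = v - (v @ b) * b          # remove component along prior basis vector
        basis.append(v / (v.norm() + 1e-8))  # normalize (eps for stability)
    return torch.stack(basis)


class MooreActor(nn.Module):
    def __init__(self, state_dim: int, rep_dim: int, n_experts: int,
                 n_tasks: int, action_dim: int):
        super().__init__()
        # h_phi: mixture of experts, one representation per expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(state_dim, rep_dim), nn.ReLU(),
                          nn.Linear(rep_dim, rep_dim))
            for _ in range(n_experts)])
        # w_c: task-specific mixing weights over the orthogonal basis
        self.task_weights = nn.Embedding(n_tasks, n_experts)
        # f_theta: output head mapping the combined representation to an action
        self.head = nn.Linear(rep_dim, action_dim)

    def forward(self, s: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        U = torch.stack([e(s) for e in self.experts])  # 1: U_s = h_phi(s), [k, d]
        V = gram_schmidt(U)                            # 2: V_s = GS(U_s), Eq. 2
        w = self.task_weights(c)                       # task weights w_c, [k]
        v_c = w @ V                                    # 3: v_c = V_s w_c
        return self.head(v_c)                          # 4: a = f_theta(v_c)


# Usage: a single (unbatched) state for task index 3
actor = MooreActor(state_dim=8, rep_dim=16, n_experts=4, n_tasks=10, action_dim=2)
a = actor(torch.randn(8), torch.tensor(3))
```

The sketch is unbatched for clarity. The design intent, per the abstract, is that orthogonalizing the expert outputs promotes diversity among experts, while the learned task weights w_c select a task-specific representation from the shared subspace.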