Distributed Multitask Reinforcement Learning with Quadratic Convergence
Authors: Rasul Tutunov, Dongho Kim, Haitham Bou Ammar
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyse the performance of our method both theoretically and empirically. On the theory side, we formally prove quadratic convergence. On the empirical side, we show that our new technique outperforms state-of-the-art methods from both distributed optimisation and lifelong reinforcement learning on a variety of graph topologies. |
| Researcher Affiliation | Industry | Rasul Tutunov, PROWLER.io, Cambridge, United Kingdom (rasul@prowler.io); Dongho Kim, PROWLER.io, Cambridge, United Kingdom (dongho@prowler.io); Haitham Bou-Ammar, PROWLER.io, Cambridge, United Kingdom (haitham@prowler.io) |
| Pseudocode | No | The paper describes its solution steps in text but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the methodology described. |
| Open Datasets | Yes | Our experiments ran on five systems, simple mass (SM), double mass (DM), cart-pole (CP), helicopter (HC), and humanoid robots (HR). We followed the experimental protocol in [10, 33] where we generated 5000 SM, 500 DM, and 1000 CP tasks by varying the dynamical parameters of each of the above systems. |
| Dataset Splits | No | The paper describes generating and distributing tasks but does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) or reference standard predefined splits for these tasks. |
| Hardware Specification | No | The paper mentions using 'MATLAB's parallel pool running on 10 nodes' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for the experiments. |
| Software Dependencies | No | The paper mentions the use of 'MATLAB' but does not provide specific version numbers for MATLAB or any other software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | An ϵ = 1/100 was provided to the Chebyshev solver for determining the approximate Newton direction in all cases. Step-sizes were determined separately for each algorithm using a grid-search-like technique over {0.01, …, 1} to ensure best operating conditions. |
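
The step-size selection described in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code (the paper reports using MATLAB and releases no implementation). The intermediate grid points, the toy quadratic objective, the plain gradient-descent update, and all function names below are assumptions made for illustration. Only the endpoints 0.01 and 1 of the grid come from the source.

```python
# Hypothetical sketch: pick a step-size by grid search on a toy problem.
# The objective, the update rule, and the grid spacing are assumptions;
# the source only states that the grid spans {0.01, ..., 1}.

def objective(x, curvatures):
    """Diagonal quadratic f(x) = 0.5 * sum_i a_i * x_i^2."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(curvatures, x))


def run_gradient_descent(step_size, curvatures, x0, num_iters=50):
    """Run fixed-step gradient descent and return the final objective value."""
    x = list(x0)
    for _ in range(num_iters):
        # The gradient of the diagonal quadratic is a_i * x_i per coordinate.
        x = [xi - step_size * a * xi for a, xi in zip(curvatures, x)]
    return objective(x, curvatures)


def grid_search_step_size(candidates, curvatures, x0):
    """Evaluate every candidate step-size and return the best one with all results."""
    results = {s: run_gradient_descent(s, curvatures, x0) for s in candidates}
    best = min(results, key=results.get)
    return best, results


# Assumed grid: the interior points are illustrative, not taken from the paper.
CANDIDATES = [0.01, 0.05, 0.1, 0.5, 1.0]
best_step, final_values = grid_search_step_size(CANDIDATES, [1.0, 4.0], [1.0, 1.0])
print(best_step, final_values[best_step])
```

On this toy problem the largest candidates oscillate or diverge while the smallest converge slowly, which is why a per-algorithm search of this kind is needed before comparing methods.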