Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
Authors: Sajad Khodadadian, Pranay Sharma, Gauri Joshi, Siva Theja Maguluri
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose federated versions of on-policy TD, off-policy TD and Q-learning, and analyze their convergence. For all these algorithms, to the best of our knowledge, we are the first to consider Markovian noise and multiple local updates, and prove a linear convergence speedup with respect to the number of agents. |
| Researcher Affiliation | Academia | ¹H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA; ²Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA |
| Pseudocode | Yes | Algorithm 1 Federated n-step TD (On-policy, Function Approx.), Algorithm 2 Federated n-step TD (Off-policy Tabular Setting), Algorithm 3 Federated Q-learning, Algorithm 4 Federated Stochastic Approximation with Markovian Noise (Fed SAM) |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on mathematical analysis and proofs, so it does not use or specify any public datasets for training or evaluation. |
| Dataset Splits | No | This paper is purely theoretical and does not involve experimental validation with datasets, so no training/validation/test splits are specified. |
| Hardware Specification | No | This paper is theoretical and does not describe empirical experiments, therefore no specific hardware specifications for running experiments are provided. |
| Software Dependencies | No | This paper is theoretical and does not describe empirical experiments, therefore no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | This paper is theoretical and focuses on mathematical analysis and proofs rather than empirical experiments, so it does not include details on experimental setup or hyperparameters. |