reproducibilityindex.ai

Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning

Authors: Chenyu Zhang, Han Wang, Aritra Mitra, James Anderson

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Figure 1, we plot the mean squared error averaged over ten runs for different heterogeneity levels and numbers of agents. The simulation results are consistent with Corollary 2.1 and demonstrate the robustness of our method towards environmental heterogeneity.
Researcher Affiliation	Academia	Chenyu Zhang, Data Science Institute, Columbia University, New York, NY 10025, USA, cz2736@columbia.edu; Han Wang, Department of Electrical Engineering, Columbia University, New York, NY 10025, USA, hw2786@columbia.edu; Aritra Mitra, Department of Electrical and Computer Engineering, NC State University, Raleigh, NC 27695, USA, amitra2@ncsu.edu; James Anderson, Department of Electrical Engineering, Columbia University, New York, NY 10025, USA, james.anderson@columbia.edu
Pseudocode	Yes	We present FedSARSA in Algorithm 1. Algorithm 1: FedSARSA
Open Source Code	No	The paper does not include a statement about releasing open-source code or provide a link to a code repository.
Open Datasets	No	To construct heterogeneous MDPs, we first generate a nominal MDP M1 and obtain the remaining MDPs by adding the perturbations to M1.
Dataset Splits	No	The paper conducts simulations in a generated environment, not on a traditional dataset with explicit train/validation/test splits described for reproduction.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers.
Experiment Setup	Yes	We create a finite state space of size |S| = 100, an action space of size |A| = 100, a feature space of dimension d = 25, and set γ = 0.2 and R = 10. The actions determine the transition matrices by shifting the columns of a reference matrix. The synchronization period is set to K = 10 and the step size to α0 = 0.01. For the full experiment setup, please refer to Appendix C.
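
Since no code was released, the following is a minimal Python sketch of how the simulated environments described in the Open Datasets and Experiment Setup rows might be generated. It is not the authors' implementation. The function names, the random reference matrix, the cyclic direction of the column shift, and the convex-mixture perturbation controlled by `eps` are all assumptions made for illustration; the exact construction is given in Appendix C of the paper, which is not reproduced here.

```python
import random

# Hyperparameters as reported in the Experiment Setup row.
CONFIG = {"S": 100, "A": 100, "d": 25, "gamma": 0.2, "R": 10, "K": 10, "alpha0": 0.01}


def random_stochastic_matrix(n, rng):
    """Return an n x n row-stochastic matrix as a list of rows."""
    rows = []
    for _ in range(n):
        row = [rng.random() for _ in range(n)]
        total = sum(row)
        rows.append([x / total for x in row])
    return rows


def shift_columns(matrix, k):
    """Cyclically shift the columns right by k; each row still sums to 1."""
    n = len(matrix[0])
    k %= n
    return [row[n - k:] + row[:n - k] for row in matrix]


def perturb(matrix, eps, rng):
    """Mix each row with a random distribution (assumed perturbation form).

    eps in [0, 1] plays the role of a heterogeneity level: eps = 0 returns
    the same transition probabilities, larger eps moves further away.
    """
    noise = random_stochastic_matrix(len(matrix), rng)
    return [
        [(1 - eps) * p + eps * q for p, q in zip(row, noise_row)]
        for row, noise_row in zip(matrix, noise)
    ]


def make_mdps(num_agents, num_states, num_actions, eps, seed=0):
    """Build a nominal MDP M1 and num_agents - 1 perturbed copies.

    Each MDP is a list of per-action transition matrices; action a uses
    the reference matrix with its columns shifted by a.
    """
    rng = random.Random(seed)
    reference = random_stochastic_matrix(num_states, rng)
    nominal = [shift_columns(reference, a) for a in range(num_actions)]
    mdps = [nominal]
    for _ in range(num_agents - 1):
        mdps.append([perturb(P, eps, rng) for P in nominal])
    return mdps
```

Calling `make_mdps(num_agents, CONFIG["S"], CONFIG["A"], eps)` would produce environments at the reported sizes; smaller values are convenient for a quick check. Rewards, features, and the FedSARSA update itself are deliberately left out, because the table above does not specify them.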