Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee
Authors: Xiaofeng Fan, Yining Ma, Zhongxiang Dai, Wei Jing, Cheston Tan, Bryan Kian Hsiang Low
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | All theoretical results are empirically verified on various RL benchmark tasks. We also demonstrate its empirical efficacy on various RL benchmark tasks (Section 5). |
| Researcher Affiliation | Collaboration | (1) Dept. of Computer Science, National University of Singapore, Republic of Singapore; (2) Dept. of ISEM, National University of Singapore, Republic of Singapore; (3) Institute for Infocomm Research, A*STAR, Republic of Singapore; (4) Alibaba DAMO Academy, Hangzhou, China |
| Pseudocode | Yes | Algorithm 1: FedPG-BR; Algorithm 1.1: FedPG-Aggregate (a hedged aggregation sketch follows the table). |
| Open Source Code | Yes | The code and instructions to reproduce the results are given in our github repository: https://github.com/flint-xf-fan/Byzantine-Federeated-RL |
| Open Datasets | Yes | We evaluate the empirical performances of FedPG-BR with and without Byzantine agents on different RL benchmarks, including CartPole balancing [55], Lunar Lander, and the 3D continuous locomotion control task of Half-Cheetah [56]. |
| Dataset Splits | No | The paper describes how policies are evaluated through interaction with the environment, but does not specify traditional training/validation/test *dataset* splits as would be common in supervised learning. RL often involves continuous interaction rather than static data splits for validation. |
| Hardware Specification | Yes | All experiments are conducted on an internal cluster of machines with Intel(R) Xeon(R) Gold 6130 CPUs (2.10GHz, 16 cores), 192GB RAM, and NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper does not specify software dependencies or version numbers for its implementation (e.g., Python or PyTorch versions). |
| Experiment Setup | Yes | if we choose $\eta_t = \frac{1}{2\Psi B_t^{2/3}}$, $b_t = 1$, and $B_t = B \geq (4\Phi/L)^2$, where $\Phi \triangleq L_g + C_g^2 C_w$ and $\Psi \triangleq (L(L_g + C_g^2 C_w))^{1/3}$...; For CartPole-v1, the episode horizon is set to 200, the server batch size $B_t$ is set to 1000, and the server mini-batch size $b_t$ is set to 10. The number of training iterations $T$ is set to 5000 (these choices are written out in the step-size sketch after the table). |
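
To make the filtering idea behind Algorithm 1.1 (FedPG-Aggregate) concrete, below is a minimal Python sketch of Byzantine-filtered gradient aggregation. This is not the paper's exact rule: the coordinate-wise median reference point and the fixed distance threshold are illustrative assumptions only; the authors' actual filtering criteria are specified in the paper and the linked repository.

```python
# Hedged sketch of Byzantine-filtered gradient aggregation, in the spirit of
# FedPG-Aggregate (Algorithm 1.1). NOT the paper's exact filtering rule: the
# median center and the fixed threshold below are illustrative assumptions.
import numpy as np

def filtered_aggregate(grads: np.ndarray, threshold: float) -> np.ndarray:
    """grads: (K, d) array of per-agent policy-gradient estimates.
    Drops estimates far from the coordinate-wise median, averages the rest."""
    center = np.median(grads, axis=0)               # robust reference point
    dists = np.linalg.norm(grads - center, axis=1)  # per-agent deviation
    keep = dists <= threshold                       # suspected Byzantine agents are dropped
    if not keep.any():                              # degenerate case: keep the closest agent
        keep[np.argmin(dists)] = True
    return grads[keep].mean(axis=0)

# Usage: 10 honest agents plus 3 Byzantine agents sending large random vectors.
rng = np.random.default_rng(0)
honest = rng.normal(0.0, 0.1, size=(10, 4)) + 1.0
byzantine = rng.normal(0.0, 10.0, size=(3, 4))
g = filtered_aggregate(np.vstack([honest, byzantine]), threshold=1.0)
print(g)  # close to the honest mean, roughly [1, 1, 1, 1]
```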
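
The step-size and batch-size choices quoted in the Experiment Setup row can be written out directly. In the sketch below, $L$, $L_g$, $C_g$, and $C_w$ are the smoothness and gradient-bound constants from the paper's assumptions; the numeric values passed in are placeholders for illustration, not values reported by the authors.

```python
# Minimal sketch of the theoretical step-size rule quoted above:
#   eta_t = 1 / (2 * Psi * B_t^(2/3)),  Phi = L_g + C_g^2 * C_w,  Psi = (L * Phi)^(1/3).
# L, L_g, C_g, C_w are problem-dependent constants from the paper's assumptions;
# the values used at the bottom are placeholders only.

def theoretical_step_size(B_t: int, L: float, L_g: float, C_g: float, C_w: float) -> float:
    phi = L_g + C_g ** 2 * C_w          # Phi = L_g + C_g^2 * C_w
    psi = (L * phi) ** (1.0 / 3.0)      # Psi = (L * Phi)^(1/3)
    return 1.0 / (2.0 * psi * B_t ** (2.0 / 3.0))

# CartPole-v1 settings quoted in the table: episode horizon 200, server batch
# size B_t = 1000, server mini-batch size b_t = 10, T = 5000 training iterations.
CARTPOLE = {"horizon": 200, "B_t": 1000, "b_t": 10, "T": 5000}

if __name__ == "__main__":
    # Placeholder constants (NOT from the paper):
    eta = theoretical_step_size(B_t=CARTPOLE["B_t"], L=1.0, L_g=1.0, C_g=1.0, C_w=1.0)
    print(f"eta_t = {eta:.6f}")
```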