Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Single-Loop Federated Actor-Critic across Heterogeneous Environments

Authors: Ye Zhu, Xiaowen Gong

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance of SFAC through numerical experiments using common RL benchmarks, which demonstrate its effectiveness. ... We test the SFAC algorithm in the Lunar Lander environment provided by Open AI Gym and the codes were running using T4 Tensor Core GPUs. We evaluate SFAC against A3C (Shen et al. 2023). ... Figure 1: SFAC Performance in Comparison to A3C
Researcher Affiliation | Academia | Ye Zhu, Xiaowen Gong, Auburn University, Auburn, AL, USA, EMAIL
Pseudocode | Yes | Algorithm 1: Single-Loop Federated Actor Critic (SFAC) ... Algorithm 2: Federated Critic (FedC) ... Algorithm 3: Federated Actor (FedA)
Open Source Code | No | The paper provides neither a link to a code repository nor an explicit statement that the code for the described methodology is open-sourced.
Open Datasets | Yes | We test the SFAC algorithm in the Lunar Lander environment provided by Open AI Gym
Dataset Splits | No | The paper uses the Lunar Lander environment (a simulation) and reports the actor sample size (20), but specifies no explicit training/validation/test splits. Such splits are less directly applicable in reinforcement learning, where data is generated through interaction with the environment rather than drawn from a static dataset.
Hardware Specification | Yes | the codes were running using T4 Tensor Core GPUs
Software Dependencies | No | The paper mentions 'Open AI Gym' as the environment and multilayer perceptrons with ReLU and softmax activations for the models, but provides no version numbers for any software libraries or tools (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | The stepsizes are set to α_k = 10^-4 · 0.99^k and β_k = 10^-4 · 0.99^k. The discount factor is 0.99. To measure the average performance, we collect the return for each round and use the average of the latest 10 rounds for all agents as the average return during training. In each inner round, the number of local updating steps is set to T = 10. The sample size of actors is set to 20.
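The hyperparameters quoted in the Experiment Setup row can be collected into a short configuration sketch. This is a hedged illustration only, not the authors' code: the names (`stepsize`, `average_return`, and the constants) are hypothetical, and the functions merely restate the reported schedule α_k = β_k = 10^-4 · 0.99^k and the "average of the latest 10 rounds" metric.

```python
# Illustrative sketch of the reported SFAC experiment settings.
# All identifiers are hypothetical; values are taken from the paper's quotes.

GAMMA = 0.99        # discount factor
BASE_LR = 1e-4      # base stepsize shared by actor (alpha) and critic (beta)
DECAY = 0.99        # per-round multiplicative stepsize decay
T_LOCAL = 10        # local updating steps per inner round
NUM_ACTORS = 20     # sample size of actors


def stepsize(k: int) -> float:
    """Decayed stepsize for round k: alpha_k = beta_k = 1e-4 * 0.99^k."""
    return BASE_LR * DECAY ** k


def average_return(round_returns: list[float], window: int = 10) -> float:
    """Average of the latest `window` per-round returns, the paper's
    reported training metric (averaged over all agents upstream)."""
    recent = round_returns[-window:]
    return sum(recent) / len(recent)
```

A training loop would query `stepsize(k)` once per outer round and log `average_return` over the collected returns; the decay schedule shrinks both actor and critic stepsizes by 1% each round.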