Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Single-Loop Federated Actor-Critic across Heterogeneous Environments
Authors: Ye Zhu, Xiaowen Gong
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of SFAC through numerical experiments using common RL benchmarks, which demonstrate its effectiveness. ... We test the SFAC algorithm in the Lunar Lander environment provided by Open AI Gym and the codes were running using T4 Tensor Core GPUs. We evaluate SFAC against A3C (Shen et al. 2023). ... Figure 1: SFAC Performance in Comparison to A3C |
| Researcher Affiliation | Academia | Ye Zhu, Xiaowen Gong Auburn University, Auburn, AL, USA EMAIL |
| Pseudocode | Yes | Algorithm 1: Single-Loop Federated Actor Critic (SFAC) ... Algorithm 2: Federated Critic (Fed C) ... Algorithm 3: Federated Actor (Fed A) |
| Open Source Code | No | The paper does not provide a specific link to a code repository or an explicit statement about open-sourcing the code for the described methodology. |
| Open Datasets | Yes | We test the SFAC algorithm in the Lunar Lander environment provided by Open AI Gym |
| Dataset Splits | No | The paper uses the Lunar Lander environment, which is a simulation, and provides a sample size for actors (20), but does not specify explicit training/test/validation dataset splits as typically found with static datasets. The concept of splits is less directly applicable in reinforcement learning environments where data is generated through interaction. |
| Hardware Specification | Yes | the codes were running using T4 Tensor Core GPUs |
| Software Dependencies | No | The paper mentions 'Open AI Gym' as the environment and 'multilayer perceptrons' with 'Relu functions' and 'softmax function' for models, but does not provide specific version numbers for any software libraries (e.g., PyTorch, TensorFlow, Python version) or tools. |
| Experiment Setup | Yes | The stepsizes are set to $\alpha_k = 10^{-4} \cdot 0.99^k$ and $\beta_k = 10^{-4} \cdot 0.99^k$. The discount factor is 0.99. To measure the average performance, we collect the return for each round and use the average of the latest 10 rounds for all agents as the average return during training. In each inner round, the number of local updating steps is set to T = 10. The sample size of actors is set to 20. |
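
The hyperparameters quoted in the Experiment Setup row can be collected into a short configuration sketch. This is a minimal illustration, not the authors' code (the paper releases none): the names `stepsize`, `average_recent_return`, and all constants' identifiers are hypothetical. It also assumes that "the average of the latest 10 rounds for all agents" means pooling each agent's last 10 per-round returns and taking the mean.

```python
from statistics import mean

# Values quoted in the Experiment Setup row; identifier names are hypothetical.
BASE_STEPSIZE = 1e-4    # leading factor of alpha_k and beta_k
STEPSIZE_DECAY = 0.99   # geometric decay applied per round k
DISCOUNT_FACTOR = 0.99
LOCAL_STEPS = 10        # T, local updating steps per inner round
ACTOR_SAMPLE_SIZE = 20  # sample size of actors
RETURN_WINDOW = 10      # latest rounds used for the reported average


def stepsize(k, base=BASE_STEPSIZE, decay=STEPSIZE_DECAY):
    """Return base * decay**k, the schedule reported for both alpha_k and beta_k."""
    if k < 0:
        raise ValueError("round index k must be non-negative")
    return base * decay ** k


def average_recent_return(returns_per_agent, window=RETURN_WINDOW):
    """Mean of the last `window` per-round returns, pooled over all agents.

    `returns_per_agent` is a list with one list of per-round returns per agent.
    Pooling across agents is an assumption about the paper's wording.
    """
    pooled = [r for agent_returns in returns_per_agent
              for r in agent_returns[-window:]]
    if not pooled:
        raise ValueError("no returns recorded yet")
    return mean(pooled)


if __name__ == "__main__":
    # Stepsize at the first few rounds.
    for k in range(3):
        print(k, stepsize(k))
    # Two toy agents with 12 recorded rounds each (made-up numbers).
    toy_returns = [list(range(12)), [1.0] * 12]
    print(average_recent_return(toy_returns))
```

Because the actor and critic schedules are reported as identical, a single `stepsize` function serves both here; a faithful reimplementation would still need the library versions that the paper does not specify.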