Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Authors: Ziyi Chen, Yi Zhou, Rong-Rong Chen, Shaofeng Zou
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than the existing decentralized AC algorithms. |
| Researcher Affiliation | Academia | ¹Department of Electrical and Computer Engineering, University of Utah. ²Department of Electrical Engineering, University at Buffalo. |
| Pseudocode | Yes | Algorithm 1 Decentralized Actor-Critic; Algorithm 2 Decentralized TD (critic update); Algorithm 3 Decentralized Natural Actor-Critic |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating the release of source code for the methodology described. |
| Open Datasets | No | The paper describes experiments in simulated environments (e.g., "decentralized ring network", "fully connected network", "two-agent Cliff Navigation environment") rather than using a publicly available dataset with a specific link or citation. |
| Dataset Splits | No | The paper describes experiments in simulated environments and evaluates performance over iterations, but it does not specify explicit train/validation/test dataset splits typical for supervised learning tasks. |
| Hardware Specification | No | The paper describes the simulation setup and hyperparameters but does not provide any specific details about the hardware (e.g., GPU/CPU models) used to run the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | For our Algorithm 1, we choose $T = 500$, $T_c = 50$, $T_c' = 10$, $N_c = 10$, $T' = T_z = 5$, $\beta = 0.5$, $\{\sigma_m\}_{m=1}^{6} = 0.1$, and consider batch size choices $N = 100, 500, 2000$. Algorithm 3 uses the same hyperparameters as those of Algorithm 1 except that $T = 2000$ in Algorithm 3. We select $\alpha = 10, 50, 200$ for Algorithm 1 with $N = 100, 500, 2000$, respectively, and $T_z = 5$, $\alpha = 0.1, 0.5, 2$, $\eta = 0.04, 0.2, 0.8$, $K = 50, 100, 200$, $N_k = 2, 5, 10$ for Algorithm 3 with $N = 100, 500, 2000$, respectively. |
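
The "respectively" pairing in the Experiment Setup row can be hard to read in prose. The sketch below lays out only the batch-size-dependent values quoted there as a lookup keyed by batch size N. The function name, dictionary keys, and constant names are hypothetical and chosen for illustration; they do not come from the paper or from any released code, and the shared hyperparameters are left out.

```python
# Batch-size-dependent values as quoted in the Experiment Setup row.
# All identifiers here are illustrative assumptions, not the authors' code.
BATCH_SIZES = (100, 500, 2000)          # N

ALG1_ALPHA = (10, 50, 200)              # alpha for Algorithm 1, one per N
ALG3_ALPHA = (0.1, 0.5, 2)              # alpha for Algorithm 3, one per N
ALG3_ETA = (0.04, 0.2, 0.8)             # eta for Algorithm 3, one per N
ALG3_K = (50, 100, 200)                 # K for Algorithm 3, one per N
ALG3_N_K = (2, 5, 10)                   # N_k for Algorithm 3, one per N


def settings_by_batch_size():
    """Pair each batch size with the values listed for it ("respectively")."""
    alg1 = {n: {"alpha": a} for n, a in zip(BATCH_SIZES, ALG1_ALPHA)}
    alg3 = {
        n: {"alpha": a, "eta": e, "K": k, "N_k": nk}
        for n, a, e, k, nk in zip(
            BATCH_SIZES, ALG3_ALPHA, ALG3_ETA, ALG3_K, ALG3_N_K
        )
    }
    return {"algorithm_1": alg1, "algorithm_3": alg3}
```

For instance, `settings_by_batch_size()["algorithm_3"][500]` gives the values listed for Algorithm 3 at N = 500.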