reproducibilityindex.ai

Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

Authors: Ziyi Chen, Yi Zhou, Rong-Rong Chen, Shaofeng Zou

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than the existing decentralized AC algorithms.
Researcher Affiliation	Academia	(1) Department of Electrical and Computer Engineering, University of Utah. (2) Department of Electrical Engineering, University at Buffalo.
Pseudocode	Yes	Algorithm 1 Decentralized Actor-Critic; Algorithm 2 Decentralized TD (critic update); Algorithm 3 Decentralized Natural Actor-Critic
Open Source Code	No	The paper does not include an unambiguous statement or link indicating the release of source code for the methodology described.
Open Datasets	No	The paper describes experiments in simulated environments (e.g., "decentralized ring network", "fully connected network", "two-agent Cliff Navigation environment") rather than using a publicly available dataset with a specific link or citation.
Dataset Splits	No	The paper describes experiments in simulated environments and evaluates performance over iterations, but it does not specify explicit train/validation/test dataset splits typical for supervised learning tasks.
Hardware Specification	No	The paper describes the simulation setup and hyperparameters but does not provide any specific details about the hardware (e.g., GPU/CPU models) used to run the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation.
Experiment Setup	Yes	For our Algorithm 1, we choose T = 500, T_c = 50, T_c' = 10, N_c = 10, T' = T_z = 5, β = 0.5, {σ_m}_{m=1}^{6} = 0.1, and consider batch size choices N = 100, 500, 2000. Algorithm 3 uses the same hyperparameters as those of Algorithm 1 except that T = 2000 in Algorithm 3. We select α = 10, 50, 200 for Algorithm 1 with N = 100, 500, 2000, respectively, and T_z = 5, α = 0.1, 0.5, 2, η = 0.04, 0.2, 0.8, K = 50, 100, 200, N_k = 2, 5, 10 for Algorithm 3 with N = 100, 500, 2000, respectively.
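
Since no source code is released, anyone attempting a reproduction has to transcribe the reported settings by hand. The sketch below collects the values from the Experiment Setup row into plain Python dictionaries, one per batch size. It is only an illustration: the key names and the helper functions `alg1_configs` and `alg3_configs` are hypothetical, and the keys `T_prime` and `T_c_prime` assume that the second "T" and the "T c" in the extracted text are primed symbols whose marks were lost in extraction.

```python
# Hypothetical transcription of the reported hyperparameters.
# Key names are illustrative; they are not taken from the paper's code.

BATCH_SIZES = (100, 500, 2000)  # batch size choices N

# Settings stated for Algorithm 1 and reused by Algorithm 3.
SHARED = {
    "T_c": 50,
    "T_c_prime": 10,  # assumption: "T c = 10" read as a primed T_c
    "N_c": 10,
    "T_prime": 5,     # assumption: second "T" in "T = Tz = 5" read as T'
    "T_z": 5,
    "beta": 0.5,
    "sigma": 0.1,     # same value for all six sigma_m
}


def alg1_configs():
    """One config per batch size for Algorithm 1 (T = 500)."""
    alphas = (10, 50, 200)
    return [
        {**SHARED, "T": 500, "N": n, "alpha": a}
        for n, a in zip(BATCH_SIZES, alphas)
    ]


def alg3_configs():
    """One config per batch size for Algorithm 3 (T = 2000)."""
    alphas = (0.1, 0.5, 2)
    etas = (0.04, 0.2, 0.8)
    Ks = (50, 100, 200)
    N_ks = (2, 5, 10)
    return [
        {**SHARED, "T": 2000, "N": n, "alpha": a, "eta": e, "K": k, "N_k": nk}
        for n, a, e, k, nk in zip(BATCH_SIZES, alphas, etas, Ks, N_ks)
    ]


if __name__ == "__main__":
    for cfg in alg1_configs() + alg3_configs():
        print(cfg)
```

Pairing the per-batch-size values with `zip` mirrors the "respectively" in the quoted text: the first α goes with N = 100, the second with N = 500, and so on.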