Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents

Authors: Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Basar

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first evaluate our algorithms with linear function approximation... It is shown in Figure 2 that the proposed algorithms successfully converge even with such nonlinear function approximators...
Researcher Affiliation | Collaboration | (1) Department of Electrical and Computer Engineering & Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, USA; (2) Department of Operations Research and Financial Engineering, Princeton University, USA; (3) Department of Electrical Engineering and Computer Science and Statistics, Northwestern University, USA; (4) Tencent AI Lab, China.
Pseudocode | Yes | We refer to the steps (3.3)-(3.5) as Algorithm 1, whose pseudocode is provided in Appendix A.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code, nor a link to a code repository for the described methodology.
Open Datasets | Yes | We consider the MARL task of Cooperative Navigation from Lowe et al. (2017). To be compatible with our networked multi-agent MDP, we modify the environment and provide the details in Appendix E.2.
Dataset Splits | No | The paper describes experiments in a reinforcement learning environment over 'episodes' but does not specify traditional train/validation/test splits (e.g., 80/10/10) of the kind used in supervised learning.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the implementation or experiments.
Experiment Setup | No | The paper defers model and environment details to Appendices E.1 and E.2; the main text does not state specific hyperparameter values (e.g., learning rate, batch size) or detailed system-level training settings.
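The Pseudocode row above refers to Algorithm 1, the networked actor-critic updates (3.3)-(3.5), whose core idea is a local temporal-difference update at each agent followed by consensus averaging of critic parameters over the communication network. The snippet below is a minimal illustrative sketch of that consensus pattern, not the authors' implementation: the function name, step sizes, and weight matrix are all hypothetical placeholders.

```python
import numpy as np

def consensus_critic_step(omega, phi_s, phi_s_next, rewards, C,
                          beta=0.05, gamma=0.95):
    """One synchronous round of local TD(0) plus consensus averaging.

    Hypothetical sketch of the consensus idea behind Algorithm 1;
    all names and constants here are illustrative, not from the paper.

    omega      : (N, d) array, per-agent linear critic parameters
    phi_s      : (d,) feature vector of the current global state
    phi_s_next : (d,) feature vector of the next global state
    rewards    : (N,) per-agent local rewards
    C          : (N, N) doubly stochastic consensus weight matrix
    """
    N, d = omega.shape
    updated = np.empty_like(omega)
    for i in range(N):
        # Local TD error computed from agent i's own reward and critic.
        delta = rewards[i] + gamma * phi_s_next @ omega[i] - phi_s @ omega[i]
        updated[i] = omega[i] + beta * delta * phi_s
    # Consensus step: each agent mixes its parameters with its neighbors'.
    return C @ updated
```

With a fully mixing weight matrix (all entries 1/N), one round already drives every agent to the same averaged critic; sparser, network-constrained choices of C only mix with neighbors and reach agreement gradually.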