reproducibilityindex.ai

Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization

Authors: Hoi-To Wai, Zhuoran Yang, Zhaoran Wang, Mingyi Hong

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In §4 we illustrate the empirical performance of the proposed algorithm. [...] To verify the performance of our proposed method, we conduct an experiment on the mountaincar dataset [46] under a setting similar to [15]: to collect the dataset, we ran Sarsa with d = 300 features to obtain the policy, then we generate the trajectories of actions and states according to the policy with M samples.
Researcher Affiliation	Academia	Hoi-To Wai The Chinese University of Hong Kong Shatin, Hong Kong htwai@se.cuhk.edu.hk Zhuoran Yang Princeton University Princeton, NJ, USA zy6@princeton.edu Zhaoran Wang Northwestern University Evanston, IL, USA zhaoranwang@gmail.com Mingyi Hong University of Minnesota Minneapolis, MN, USA mhong@umn.edu
Pseudocode	Yes	Algorithm 1 PD-DistIAG Method for Multi-agent, Primal-dual, Finite-sum Optimization
Open Source Code	No	The paper does not contain any concrete access information (e.g., a specific repository link, an explicit code release statement, or mention of code in supplementary materials) for the source code of the methodology.
Open Datasets	Yes	To verify the performance of our proposed method, we conduct an experiment on the mountaincar dataset [46] under a setting similar to [15]: to collect the dataset [...]
Dataset Splits	No	The paper mentions 'M = 5000 samples' but does not specify the train/validation/test dataset splits (e.g., percentages, sample counts for each split, or reference to predefined splits).
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper mentions using Sarsa and comparing with PDBG, GTD2, and SAGA, but it does not specify version numbers for any software dependencies or libraries required to replicate the experiments.
Experiment Setup	Yes	For PD-DistIAG, we simulate a communication network with N = 10 agents, connected on an Erdos-Renyi graph generated with connectivity of 0.2; for the step sizes, we set γ1 = 0.005/λmax(Â), γ2 = 5 × 10^-3. For this problem, we have d = 300, M = 5000 samples, and there are N = 10 agents.
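
The quoted experiment setup can be made concrete with a short sketch. The Python below samples an Erdos-Renyi communication graph with N = 10 and connectivity 0.2, builds a doubly stochastic mixing matrix, and computes the two step sizes. Several things here are assumptions rather than details from the paper: all function names are hypothetical, resampling until the graph is connected is one possible way to obtain a usable network, Metropolis weights are one common choice of mixing matrix, the second step size is read as 5 × 10^-3, and λmax(Â) is passed in as a plain number because Â depends on the collected samples.

```python
import numpy as np

N_AGENTS = 10    # N in the quoted setup
EDGE_PROB = 0.2  # Erdos-Renyi connectivity in the quoted setup


def erdos_renyi_adjacency(n, p, rng):
    """Sample a symmetric 0/1 adjacency matrix with no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)


def is_connected(adj):
    """Graph search from node 0; True if every node is reached."""
    seen = {0}
    frontier = [0]
    while frontier:
        node = frontier.pop()
        for nbr in np.flatnonzero(adj[node]):
            nbr = int(nbr)
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return len(seen) == adj.shape[0]


def sample_connected_graph(n, p, seed=0, max_tries=10_000):
    """Resample until connected (assumption: the source does not say how)."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        adj = erdos_renyi_adjacency(n, p, rng)
        if is_connected(adj):
            return adj
    raise RuntimeError("no connected graph sampled")


def metropolis_weights(adj):
    """Symmetric, doubly stochastic mixing matrix (assumed weight rule)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.flatnonzero(adj[i]):
            W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()  # remaining mass stays on the diagonal
    return W


def step_sizes(lam_max_a_hat):
    """gamma1 = 0.005 / lambda_max(A_hat), gamma2 = 5e-3 (assumed reading)."""
    return 0.005 / lam_max_a_hat, 5e-3


if __name__ == "__main__":
    adj = sample_connected_graph(N_AGENTS, EDGE_PROB, seed=0)
    W = metropolis_weights(adj)
    gamma1, gamma2 = step_sizes(2.0)  # 2.0 is a placeholder for lambda_max(A_hat)
    print(adj.sum() // 2, "edges;", "gamma1 =", gamma1, "gamma2 =", gamma2)
```

With connectivity 0.2 and only 10 nodes, a single draw is often disconnected, which is why the sketch resamples; a reproduction would need to confirm how the original graph was actually generated.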