Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL

Authors: Minshuo Chen, Yan Li, Ethan Wang, Zhuoran Yang, Zhaoran Wang, Tuo Zhao

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments are provided. We perform experiments on the multi-agent particle environment (MPE, Lowe et al. (2017)), a popular benchmark used in prior work (Mordatch and Abbeel, 2018; Liu et al., 2020a).
Researcher Affiliation | Academia | Minshuo Chen (Georgia Tech), Yan Li (Georgia Tech), Ethan Wang (Georgia Tech), Zhuoran Yang (University of California, Berkeley), Zhaoran Wang (Northwestern University), Tuo Zhao (Georgia Tech)
Pseudocode | Yes | Algorithm 1 Pessimistic Mean-Field Value Iteration (SAFARI) (see the value-iteration sketch below the table)
Open Source Code | No | Sample code is also available at (followed by an empty link in the PDF). Extension to online setting is provided in a longer technical report version, which is available upon request.
Open Datasets | Yes | We perform experiments on the multi-agent particle environment (MPE, Lowe et al. (2017)), a popular benchmark used in prior work (Mordatch and Abbeel, 2018; Liu et al., 2020a).
Dataset Splits | No | No explicit statement of train/validation/test dataset splits was found. The paper mentions using “n = 500 sample episodes of experience data” for training, but does not describe how this data is partitioned or whether a separate validation set is used.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for the experiments were provided.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or programming-language versions) were provided.
Experiment Setup | Yes | Both the policy and critic networks are implemented as traditional MLPs, with 64 and 512 nodes in a single hidden layer, respectively, and we use parameter sharing for policy networks. (See the network sketch below the table.)
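For the Pseudocode row above: Algorithm 1 (SAFARI) is not reproduced here, but the core mechanism it builds on, value iteration with an uncertainty penalty subtracted from the empirical Bellman backup, can be illustrated in a few lines. The following is a minimal tabular sketch; the penalty form, truncation, data structures, and function name are assumptions for illustration, not the paper's exact procedure (in particular, the mean-field/permutation-invariant state representation is abstracted away into the index s).

```python
import numpy as np

def pessimistic_value_iteration(P_hat, r_hat, counts, horizon, beta=1.0):
    """Illustrative pessimistic value iteration (NOT the paper's exact SAFARI).

    P_hat:  (S, A, S) transition probabilities estimated from offline data
    r_hat:  (S, A)    mean rewards estimated from offline data
    counts: (S, A)    state-action visitation counts in the offline dataset
    """
    S, A = r_hat.shape
    V = np.zeros(S)
    policy = np.zeros((horizon, S), dtype=int)
    # Assumed penalty: shrinks as offline coverage of (s, a) grows.
    Gamma = beta / np.sqrt(np.maximum(counts, 1))

    for h in reversed(range(horizon)):
        # Empirical Bellman backup minus the pessimism penalty.
        Q = r_hat + P_hat @ V - Gamma        # shape (S, A)
        Q = np.maximum(Q, 0.0)               # keep estimates non-negative
        policy[h] = np.argmax(Q, axis=1)     # act greedily w.r.t. pessimistic Q
        V = Q[np.arange(S), policy[h]]
    return policy, V
```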
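For the Experiment Setup row, here is a minimal PyTorch sketch of the reported network sizes. Only the hidden widths (64 for the policy, 512 for the critic) and the use of a single shared policy network come from the paper; the input/output dimensions, activation, and agent count below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PolicyMLP(nn.Module):
    """Single-hidden-layer policy network (64 hidden units, as reported)."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        # Categorical policy over discrete MPE actions (assumed).
        return torch.distributions.Categorical(logits=self.net(obs))

class CriticMLP(nn.Module):
    """Single-hidden-layer critic network (512 hidden units, as reported)."""
    def __init__(self, in_dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

# Parameter sharing for policy networks: all agents query one module,
# so there is a single set of policy weights (dimensions are assumed).
n_agents, obs_dim, n_actions = 3, 18, 5
shared_policy = PolicyMLP(obs_dim, n_actions)
policies = [shared_policy] * n_agents
critic = CriticMLP(obs_dim)
```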