FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning

Authors: Tianhao Zhang, Yueheng Li, Chen Wang, Guangming Xie, Zongqing Lu

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, in the well-known matrix game and differential game, we verify that FOP can converge to the global optimum for both discrete and continuous action spaces. We also evaluate FOP on a set of StarCraft II micromanagement tasks, and demonstrate that FOP substantially outperforms state-of-the-art decomposed value-based and actor-critic methods.
Researcher Affiliation | Academia | Peking University.
Pseudocode | Yes | Algorithm 1 FOP
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | Yes | We evaluate FOP on the challenging StarCraft Multi-Agent Challenge (SMAC) benchmark (Samvelyan et al., 2019). (A usage sketch for the SMAC benchmark follows this table.)
Dataset Splits | No | The paper mentions the StarCraft II micromanagement benchmark but does not specify train/validation/test splits or whether standard splits from the benchmark were used (e.g., specific percentages or sample counts).
Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide version numbers for the software dependencies or libraries used in the experiments.
Experiment Setup | No | The paper describes the FOP architecture and learning objectives but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of training steps, optimizer settings) or other detailed training configurations in the main text.
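As a concrete reference for the Open Datasets row above, the following is a minimal sketch of how a SMAC micromanagement map is typically driven through the public smac library (github.com/oxwhirl/smac). It is not taken from the paper's code, which is not released; the map name "3m" and the random action selection are illustrative placeholders only.

```python
# Minimal sketch of a SMAC episode loop using the public smac library.
# Assumptions: the "3m" map and the random policy are placeholders,
# not the FOP paper's actual setup.
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
env_info = env.get_env_info()
n_agents = env_info["n_agents"]

env.reset()
terminated = False
episode_return = 0.0
while not terminated:
    obs = env.get_obs()      # per-agent local observations
    state = env.get_state()  # global state, used only for centralized training
    actions = []
    for agent_id in range(n_agents):
        avail = env.get_avail_agent_actions(agent_id)  # 0/1 mask of legal actions
        avail_ids = np.nonzero(avail)[0]
        actions.append(np.random.choice(avail_ids))    # random placeholder policy
    reward, terminated, info = env.step(actions)
    episode_return += reward

print("episode return:", episode_return)
env.close()
```

In a decomposed actor-critic method such as FOP, the random action selection above would be replaced by sampling from each agent's individual policy conditioned on its local observation history, while the global state would be consumed only by the centralized critic during training.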