Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning
Authors: Ziluo Ding, Wanpeng Zhang, Junpeng Yue, Xiangjun Wang, Tiejun Huang, Zongqing Lu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, EnDi demonstrates strong generalization to unseen games with new dynamics and outperforms existing methods. The code is available at https://github.com/PKU-RL/EnDi. |
| Researcher Affiliation | Collaboration | ¹School of Computer Science, Peking University; ²inspir.ai; ³Beijing Academy of Artificial Intelligence. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/PKU-RL/EnDi. |
| Open Datasets | Yes | We present two goal-based multi-agent environments based on two previous single-agent settings, i.e., MESSENGER (Hanjie et al., 2021) and RTFM (Zhong et al., 2019). |
| Dataset Splits | Yes | We use the validation games to save the model parameters with the highest validation win rate during training and use these parameters to evaluate the models on the test games. Note that the validation procedure follows the same settings of previous work (Hanjie et al., 2021). (A minimal sketch of this checkpoint-selection loop follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like IMPALA, RMSProp, PPO, Adam optimizer, and BERT-base model, but does not provide specific version numbers for these or other ancillary software dependencies. |
| Experiment Setup | Yes | We train using an implementation of IMPALA (Espeholt et al., 2018). In particular, we use 10 actors and a batch size of 20. When unrolling actors, we use a maximum unroll length of 80 steps. Each episode lasts for a maximum of 1000 steps. We optimize using RMSProp with a learning rate of 0.005, which is annealed linearly for 100 million steps. We set α = 0.99 and ϵ = 0.01. (These hyperparameters are collected in the configuration sketch below the table.) |
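
The hyperparameters quoted in the Experiment Setup row map directly onto a standard PyTorch RMSProp setup with a linearly annealed learning rate. The sketch below simply collects them in one place; it is a minimal illustration under those reported values, not the EnDi implementation, and the `Config` and `make_optimizer` names are ours.

```python
# Minimal sketch of the reported IMPALA-style training hyperparameters.
# The EnDi repo (https://github.com/PKU-RL/EnDi) defines its own config;
# `Config` and `make_optimizer` below are illustrative names only.
from dataclasses import dataclass

import torch


@dataclass
class Config:
    num_actors: int = 10             # parallel actors feeding the learner
    batch_size: int = 20             # learner batch size (in unrolls)
    unroll_length: int = 80          # maximum steps per actor unroll
    max_episode_steps: int = 1000    # episode length cap
    lr: float = 5e-3                 # RMSProp learning rate (0.005)
    total_steps: int = 100_000_000   # horizon of the linear LR anneal
    rmsprop_alpha: float = 0.99      # alpha = 0.99
    rmsprop_eps: float = 0.01        # epsilon = 0.01


def make_optimizer(model: torch.nn.Module, cfg: Config):
    """RMSProp whose learning rate is annealed linearly to zero over training."""
    opt = torch.optim.RMSprop(
        model.parameters(), lr=cfg.lr, alpha=cfg.rmsprop_alpha, eps=cfg.rmsprop_eps
    )
    # LambdaLR multiplies the base lr by the returned factor at each scheduler step.
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda step: max(0.0, 1.0 - step / cfg.total_steps)
    )
    return opt, sched
```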
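
The validation protocol quoted in the Dataset Splits row amounts to periodically evaluating on held-out validation games and keeping the parameters with the highest validation win rate before testing. Below is a minimal sketch of that loop, assuming a PyTorch-style model; `train_step_fn` and `evaluate` are hypothetical helpers standing in for one learner update and a win-rate evaluation, and are not taken from the EnDi codebase.

```python
# Minimal sketch of validation-based checkpoint selection (not EnDi's code).
import copy


def select_best_checkpoint(model, train_step_fn, evaluate, val_games,
                           num_updates, eval_every):
    """Train, track the best validation win rate, and restore those parameters."""
    best_win_rate, best_state = -1.0, None
    for update in range(num_updates):
        train_step_fn(model)  # one learner update (e.g., one IMPALA batch)
        if (update + 1) % eval_every == 0:
            win_rate = evaluate(model, val_games)  # win rate on validation games
            if win_rate > best_win_rate:
                best_win_rate = win_rate
                best_state = copy.deepcopy(model.state_dict())
    if best_state is not None:
        # Parameters with the highest validation win rate, used for the test games.
        model.load_state_dict(best_state)
    return model, best_win_rate
```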