Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming

Authors: Sachin G Konan, Esmaeil Seraj, Matthew Gombolay

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments validate the utility of InfoPG by achieving higher sample efficiency and significantly larger cumulative reward in several complex cooperative multi-agent domains."
Researcher Affiliation | Academia | "Sachin Konan, Esmaeil Seraj, Matthew Gombolay; Georgia Institute of Technology, Atlanta, GA 30332, USA; {skonan, eseraj3}@gatech.edu, matthew.gombolay@cc.gatech.edu"
Pseudocode | Yes | "Please refer to Appendix, Section A.1 for pseudocode and details of our training and execution procedures. [...] Algorithm 1: Training the Mutual Information Maximizing Policy Gradient (InfoPG)." (An illustrative sketch of the iterated-reasoning loop appears after this table.)
Open Source Code | Yes | "We also publicized our source code in a public repository, available online at https://github.com/CORE-Robotics-Lab/InfoPG."
Open Datasets | Yes | "Our testing environments include: (1) Cooperative Pong (Co-op Pong) (Terry et al., 2020), (2) Pistonball (Terry et al., 2020), (3) Multiwalker (Gupta et al., 2017; Terry et al., 2020), and (4) StarCraft II (Vinyals et al., 2017), i.e., the 3M (three marines vs. three marines) challenge. [...] Domains are part of the PettingZoo (Terry et al., 2020) MARL research library and can be accessed online at https://www.pettingzoo.ml/envs. StarCraft II (Vinyals et al., 2017) can be accessed from DeepMind's repository, available online at https://github.com/deepmind/pysc2." (A minimal environment-loading sketch appears after this table.)
Dataset Splits | No | The paper discusses training and testing but does not explicitly mention a separate validation set or specific training/validation/test splits with percentages or counts.
Hardware Specification | Yes | "Hardware Specifics: All experiments were conducted on an NVIDIA Quadro RTX 8000 with approximately 50 GB of Video Memory Capacity."
Software Dependencies | No | The paper discusses the use of AlexNet and specific RNN types (GRU, LSTM, VRNN) but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | "Additionally, we have provided the details of our implementations for training and execution as well as the full hyperparameter lists for all methods, baselines, and experiments in the Appendix, Section A.9." Tables 2-6 provide detailed hyperparameters such as learning rate, batch size, and discount factor. (An illustrative configuration sketch appears after this table.)
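
For orientation, the sketch below illustrates the kind of k-level iterated reasoning the Pseudocode row refers to: each agent encodes its observation into a latent, and over k communication rounds refines that latent against its teammates' latents before acting. This is a minimal illustration under assumptions, not the authors' Algorithm 1: the layer sizes, the mean-pooled message, and the tanh fusion rule are all stand-ins, and the mutual-information objective that gives InfoPG its name is omitted entirely.

```python
import torch
import torch.nn as nn

class IteratedReasoningAgent(nn.Module):
    """Sketch of one agent in a k-level reasoning team (all sizes are assumptions)."""
    def __init__(self, obs_dim: int, latent_dim: int, n_actions: int):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, latent_dim)
        # Fuses the agent's own latent with an aggregated teammate message.
        self.comm = nn.Linear(2 * latent_dim, latent_dim)
        self.policy_head = nn.Linear(latent_dim, n_actions)

def team_act(agents, observations, k_levels=2):
    """One decentralized decision step: k rounds of latent exchange, then act."""
    # Level 0: every agent encodes its own observation.
    latents = [torch.tanh(a.encoder(o)) for a, o in zip(agents, observations)]
    for _ in range(k_levels):
        # Each round, agent i conditions on the mean of its teammates' latents
        # (this aggregation rule is an assumption, not taken from the paper).
        messages = [
            torch.stack([z for j, z in enumerate(latents) if j != i]).mean(dim=0)
            for i in range(len(latents))
        ]
        latents = [
            torch.tanh(a.comm(torch.cat([z, m], dim=-1)))
            for a, z, m in zip(agents, latents, messages)
        ]
    # Final latents parameterize each agent's action distribution.
    dists = [torch.distributions.Categorical(logits=a.policy_head(z))
             for a, z in zip(agents, latents)]
    return [d.sample() for d in dists], dists

# Tiny smoke test with made-up dimensions.
team = [IteratedReasoningAgent(obs_dim=8, latent_dim=16, n_actions=4) for _ in range(3)]
obs = [torch.randn(8) for _ in range(3)]
actions, dists = team_act(team, obs, k_levels=2)
```

InfoPG's actual update couples this decision step to a mutual-information objective over agents' policies; that coupling is deliberately left out here, since the precise form belongs to the paper's Algorithm 1.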
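The PettingZoo domains cited under Open Datasets can be instantiated in a few lines. Below is a minimal sketch using PettingZoo's parallel API, with random actions standing in for trained agents. The module version suffixes (pistonball_v6, cooperative_pong_v5, multiwalker_v9) and the five-tuple step() signature follow current PettingZoo releases and are assumptions here; older releases, closer to the ones the paper would have used, return a four-tuple from step() and observations alone from reset().

```python
# pip install "pettingzoo[butterfly]"  (version suffixes track the installed release)
from pettingzoo.butterfly import pistonball_v6  # cooperative_pong_v5 works analogously
# from pettingzoo.sisl import multiwalker_v9

env = pistonball_v6.parallel_env()
observations, infos = env.reset()
while env.agents:
    # Random policies stand in for the trained agents in this sketch.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```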
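Finally, since the hyperparameters live in Tables 2-6 of the appendix rather than in a machine-readable file, a reproduction typically begins by transcribing them into a configuration object. The dictionary below only illustrates the shape such a config might take; every value is a placeholder, not a setting from the paper.

```python
# Placeholder values only; the real settings are in Tables 2-6 of the paper's appendix.
config = {
    "learning_rate": 1e-3,     # placeholder
    "batch_size": 32,          # placeholder
    "discount_factor": 0.99,   # placeholder
    "k_levels": 2,             # depth of iterated reasoning; placeholder
    "env": "pistonball",       # one of the four benchmark domains
}
```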