Agent Modelling under Partial Observability for Deep Reinforcement Learning

Authors: Georgios Papoudakis, Filippos Christianos, Stefano Albrecht

NeurIPS 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We provide a comprehensive evaluation and ablation studies in cooperative, competitive and mixed multi-agent environments, showing that our method achieves higher returns than baseline methods which do not use the learned representations.

Researcher Affiliation | Academia | School of Informatics, University of Edinburgh. {g.papoudakis, f.christianos, s.albrecht}@ed.ac.uk

Pseudocode | Yes | The pseudocode of LIAM is given in Appendix A and the implementation details in Appendix D.

Open Source Code | Yes | We provide an implementation of LIAM at https://github.com/uoe-agents/LIAM

Open Datasets | Yes | We evaluate the proposed method in three multi-agent environments (one cooperative, one mixed, one competitive): double speaker-listener [Mordatch and Abbeel, 2017], level-based foraging [Albrecht and Stone, 2017; Papoudakis et al., 2021], and a version of predator-prey proposed by Böhmer et al. [2020].

Dataset Splits | No | The paper mentions a 'training set Π' and evaluates on policies from Π, but does not provide specific numerical train/validation/test splits (e.g., an 80/10/10 split) or refer to external resources that define such splits.

Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory specifications) used to run the experiments.

Software Dependencies | No | The paper mentions general software components and algorithms such as A2C, PyTorch, Adam, LSTM, and ReLU, but does not provide specific version numbers for these or other software dependencies.

Experiment Setup | No | The paper mentions general training practices such as using different learning rates for the RL and encoder-decoder networks and averaging results over five runs with different initial seeds, but it does not provide specific hyperparameter values (e.g., learning rates, batch sizes, or number of epochs) in the main text.
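The "Experiment Setup" row notes that the paper trains the RL network and the encoder-decoder with different learning rates but does not state the values. In PyTorch (the framework the paper reports using), this setup is typically expressed as one Adam optimizer with two parameter groups. The sketch below illustrates that pattern only; the module sizes and both learning rates are placeholders, not LIAM's actual hyperparameters.

```python
import torch
import torch.nn as nn

# Hypothetical module shapes, for illustration only.
encoder = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
decoder = nn.Linear(32, 16)  # reconstructs modelled-agent information
policy = nn.Linear(32, 4)    # RL (A2C) head

# One optimizer, two parameter groups: the RL head and the
# encoder-decoder each get their own learning rate, as the paper
# describes. Both lr values here are made-up placeholders.
optimizer = torch.optim.Adam([
    {"params": policy.parameters(), "lr": 3e-4},
    {"params": list(encoder.parameters()) + list(decoder.parameters()),
     "lr": 1e-3},
])
```

With this layout, a single `optimizer.step()` updates both networks while each group keeps its own learning rate; per-group values can also be changed later via `optimizer.param_groups[i]["lr"]`.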