Influencing Long-Term Behavior in Multiagent Reinforcement Learning

Authors: Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive evaluation of our approach (Section 4). We demonstrate that our method consistently converges to a more desirable limiting distribution than baseline methods that either neglect the learning of others [14] or consider their learning with a myopic perspective [8] in various multiagent benchmark domains.
Researcher Affiliation | Collaboration | MIT-LIDS, IBM Research, MIT-IBM Watson AI Lab, Mila, University of Oxford
Pseudocode | Yes | We provide further details, including implementation for k>1 and pseudocode, in Appendix E.
Open Source Code | Yes | The code is available at https://bit.ly/3fXArAo, and video highlights are available at https://bit.ly/37IWeb9.
Open Datasets | No | The paper mentions benchmark domains such as Bach/Stravinsky, Coordination, Matching Pennies, MuJoCo RoboSumo, and MAgent Battle. However, it does not provide concrete access information (links, DOIs, specific repository names, or formal citations with author and year) for any publicly available datasets used for training within these environments.
Dataset Splits | Yes | 3. (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix G.
Hardware Specification | Yes | All experiments are run on a machine with 32 Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz cores and 8 NVIDIA Tesla V100 32GB GPUs.
Software Dependencies | No | The paper refers to general frameworks and algorithms such as 'soft actor-critic [27]' and 'variational inference [28]' but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, scikit-learn) needed for reproduction.
Experiment Setup | Yes | We refer to Appendix G for experimental details and hyperparameters. (...) 3. (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix G.