Influencing Long-Term Behavior in Multiagent Reinforcement Learning
Authors: Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluation of our approach (Section 4). We demonstrate that our method consistently converges to a more desirable limiting distribution than baseline methods that either neglect the learning of others [14] or consider their learning with a myopic perspective [8] in various multiagent benchmark domains. |
| Researcher Affiliation | Collaboration | 1MIT-LIDS 2IBM-Research 3MIT-IBM Watson AI Lab 4Mila 5University of Oxford |
| Pseudocode | Yes | We provide further details, including implementation for k>1 and pseudocode, in Appendix E. |
| Open Source Code | Yes | The code is available at https://bit.ly/3fXArAo, and video highlights are available at https://bit.ly/37IWeb9. |
| Open Datasets | No | The paper mentions benchmark domains like Bach/Stravinsky, Coordination, Matching Pennies, MuJoCo RoboSumo, and MAgent Battle. However, it does not provide concrete access information (links, DOIs, specific repository names, or formal citations with author and year) for any publicly available datasets used for training within these environments. |
| Dataset Splits | Yes | 3. (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix G. |
| Hardware Specification | Yes | All experiments are run on a machine with 32 Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz cores and 8 NVIDIA Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper refers to general frameworks and algorithms like 'soft actor-critic [27]' and 'variational inference [28]' but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, scikit-learn, etc.) needed for reproduction. |
| Experiment Setup | Yes | We refer to Appendix G for experimental details and hyperparameters. (...) 3. (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix G. |